MIP* = RE 


Zhengfeng Ji*, Anand Natarajan†, Thomas Vidick‡, John Wright§, and Henry Yuen¶


1 Centre for Quantum Software and Information, University of Technology Sydney
2 Institute for Quantum Information and Matter, California Institute of Technology 
3 Department of Computing and Mathematical Sciences, California Institute of Technology 
4 Department of Computer Science, University of Texas at Austin 
5 Department of Computer Science and Department of Mathematics, University of Toronto 


November 7, 2022 


Abstract 


We show that the class MIP* of languages that can be decided by a classical verifier interacting 
with multiple all-powerful quantum provers sharing entanglement is equal to the class RE of recursively 
enumerable languages. Our proof builds upon the quantum low-degree test of (Natarajan and Vidick, 
FOCS 2018) and the classical low-individual degree test of (Ji et al., 2020) by integrating recent
developments from (Natarajan and Wright, FOCS 2019) and combining them with the recursive compression
framework of (Fitzsimons et al., STOC 2019). 

An immediate byproduct of our result is that there is an efficient reduction from the Halting Problem 
to the problem of deciding whether a two-player nonlocal game has entangled value 1 or at most 1/2.
Using a known connection, undecidability of the entangled value implies a negative answer to Tsirelson’s
problem: we show, by providing an explicit example, that the closure $C_{qa}$ of the set of quantum tensor
product correlations is strictly included in the set $C_{qc}$ of quantum commuting correlations. Following
work of (Fritz, Rev. Math. Phys. 2012) and (Junge et al., J. Math. Phys. 2011) our results provide a 
refutation of Connes’ embedding conjecture from the theory of von Neumann algebras. 


arXiv:2001.04383v3 [quant-ph] 4 Nov 2022


* Email: zhengfeng.ji@uts.edu.au
† Email: anandn@caltech.edu
‡ Email: vidick@caltech.edu
§ Email: wright@cs.utexas.edu
¶ Email: hyuen@cs.toronto.edu


Contents

1 Introduction
    1.1 Interactive proof systems
    1.2 Statement of result
    1.3 Consequences
    1.4 Open questions
2 Proof Overview
3 Preliminaries
    3.1 Turing machines
    3.2 Linear spaces
    3.3 Finite fields
        3.3.1 Subfields and bases
        3.3.2 Bit string representations
    3.4 Polynomials and the low-degree code
    3.5 Linear spaces and registers
    3.6 Measurements and observables
    3.7 Generalized Pauli observables
4 Conditionally Linear Functions, Distributions, and Samplers
    4.1 Conditionally linear functions and distributions
    4.2 Conditionally linear samplers
5 Nonlocal Games and MIP*
    5.1 Games and strategies
    5.2 Distance measures
    5.3 The class MIP*
    5.4 Normal form verifiers
6 Types
    6.1 Typed samplers, deciders, and verifiers
    6.2 Graph distributions
    6.3 Detyping typed verifiers
7 Classical and Quantum Low-degree Tests
    7.1 The classical low-degree test
        7.1.1 The game
        7.1.2 Complexity of the classical low-degree test
    7.2 The Magic Square game
    7.3 The Pauli basis test
        7.3.1 The game
        7.3.2 Canonical parameters and complexity of the Pauli basis test
8 Introspection Games
    8.1 Overview
    8.2 The introspective verifier
        8.2.1 Complexity of the introspective verifier
        8.2.2 Introspection theorem
    8.3 Completeness of the introspective verifier
        8.3.1 Preliminary lemmas
        8.3.2 Completeness of the introspective verifier
    8.4 Soundness of the introspective verifier
        8.4.1 The Pauli twirl
        8.4.2 Preliminary lemmas
        8.4.3 Soundness proof
9 Oracularization
    9.1 Overview
    9.2 Oracularizing normal form verifiers
    9.3 Completeness and complexity of the oracularized verifier
    9.4 Soundness of the oracularized verifier
10 Answer Reduction
    10.1 Circuit preliminaries
    10.2 A Cook-Levin theorem for bounded deciders
    10.3 A succinct 5SAT description for deciders
    10.4 A PCP for normal form deciders
        10.4.1 Preliminaries
        10.4.2 The PCP
    10.5 A normal form verifier for the PCP
        10.5.1 Parameters and notation
        10.5.2 The answer-reduced verifier
        10.5.3 Main theorem for answer reduction
    10.6 Completeness of the answer-reduced verifier
    10.7 Soundness of the answer-reduced verifier
11 Parallel Repetition
    11.1 The anchoring transformation
    11.2 The parallel repetition transformation
12 Gap-Preserving Compression
    12.1 Proof of Theorem 12.1
    12.2 An MIP* protocol for the Halting problem
    12.3 An explicit separation
A Analysis of the Pauli basis test
    A.1 Preliminaries
    A.2 Strategies
    A.3 Expanding the Hilbert space and defining commuting observables
    A.4 Combining the X and Z measurements
    A.5 Applying the classical low-degree test
    A.6 Pulling the X and Z measurements apart

1 Introduction 


For integers $n,k \geq 2$ define the quantum (spatial) correlation set $C_{qs}(n,k)$ as the subset of
$\mathbb{R}^{n^2k^2}$ that contains all tuples $(p_{abxy})$ representing families of bipartite distributions
that can be locally generated in non-relativistic quantum mechanics. Formally, $(p_{abxy}) \in C_{qs}(n,k)$
if and only if there exist separable Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$, for every
$x \in \{1,\dots,n\}$ (resp. $y \in \{1,\dots,n\}$) a collection of projections $\{A^x_a\}_{a \in \{1,\dots,k\}}$
on $\mathcal{H}_A$ (resp. $\{B^y_b\}_{b \in \{1,\dots,k\}}$ on $\mathcal{H}_B$) that sum to identity, and a state
(unit vector) $\psi \in \mathcal{H}_A \otimes \mathcal{H}_B$ such that

$$\forall x,y \in \{1,2,\dots,n\},\ \forall a,b \in \{1,2,\dots,k\}, \qquad p_{abxy} = \psi^* (A^x_a \otimes B^y_b)\, \psi. \tag{1}$$

Note that due to the normalization conditions on $\psi$ and on $\{A^x_a\}$ and $\{B^y_b\}$, for each $x,y$,
$(p_{abxy})_{a,b}$ is a probability distribution on $\{1,2,\dots,k\}^2$. By taking direct sums it is easy to see
that the set $C_{qs}(n,k)$ is convex. Let $C_{qa}(n,k)$ denote its closure (it is known that
$C_{qs}(n,k) \neq C_{qa}(n,k)$, see [Slo19a]).

Our main result is that the family of sets $\{C_{qa}(n,k)\}_{n,k \in \mathbb{N}}$ is extraordinarily complex,
in the following computational sense. For any $0 < \varepsilon < 1$ define the $\varepsilon$-weak membership
problem for $C_{qa}$ as the problem of deciding, given $n,k \in \mathbb{N}$ and a point
$p = (p_{abxy}) \in \mathbb{R}^{n^2k^2}$, whether $p$ lies in $C_{qa}(n,k)$ or is $\varepsilon$-far from
it in $\ell_1$ distance, promised that one is the case. Then we show that for any given $0 < \varepsilon < 1$
the $\varepsilon$-weak membership problem for $C_{qa}$ cannot be solved by a Turing machine that halts with
the correct answer on every input.

We show this by directly reducing the Halting problem to the weak membership problem for $C_{qa}$: we
show that for all $0 < \varepsilon < 1$ and any Turing machine $M$ one can efficiently compute integers
$n,k \in \mathbb{N}$ and a linear functional $\ell_M$ on $\mathbb{R}^{n^2k^2}$ such that, whenever $M$ halts,
it holds that

$$\sup_{p \in C_{qa}(n,k)} |\ell_M(p)| = 1, \tag{2}$$

whereas if $M$ does not halt then

$$\sup_{p \in C_{qa}(n,k)} |\ell_M(p)| \leq 1 - \varepsilon. \tag{3}$$

By standard results in convex optimization, this implies the aforementioned claim on the undecidability of
the $\varepsilon$-weak membership problem for $C_{qa}$ (for any $0 < \varepsilon < 1$).

Our result has interesting consequences for long-standing conjectures in quantum information theory and
the theory of von Neumann algebras. Through a connection that follows from the work of Navascués,
Pironio, and Acín [NPA08], the undecidability result implies a negative answer to Tsirelson’s problem [Tsi06].
Let $C_{qc}(n,k)$ denote the set of quantum commuting correlations, which is the set of tuples $(p_{abxy})$
arising from operators $\{A^x_a\}$ and $\{B^y_b\}$ acting on a single Hilbert space $\mathcal{H}$ and a state
$\psi \in \mathcal{H}$ such that

$$\forall x,y \in \{1,\dots,n\},\ \forall a,b \in \{1,\dots,k\}, \qquad p_{abxy} = \psi^* (A^x_a B^y_b)\, \psi \quad\text{and}\quad [A^x_a, B^y_b] = 0. \tag{4}$$

Then Tsirelson’s problem asks if, for all $n,k$, the sets $C_{qa}(n,k)$ and $C_{qc}(n,k)$ are equal. Using results
from [NPA08] we give integers $n,k$ and an explicit linear function $\ell$ on $\mathbb{R}^{n^2k^2}$ such that

$$\sup_{p \in C_{qc}(n,k)} |\ell(p)| = 1, \quad\text{but}\quad \sup_{p \in C_{qa}(n,k)} |\ell(p)| \leq \frac{1}{2},$$

which implies that $C_{qa}(n,k) \neq C_{qc}(n,k)$. By an implication of Fritz [Fri12] and Junge et al. [JNP+11]
we further obtain that Connes’ Embedding Conjecture [Con76] is false; in other words, there exist type $\mathrm{II}_1$
von Neumann factors that do not embed in an ultrapower of the hyperfinite $\mathrm{II}_1$ factor. We explain these
connections in more detail in Section 1.3 below.

Our approach to constructing such linear functionals on correlation sets goes through the theory of
interactive proofs from complexity theory. To explain this connection we first review the concept of interactive
proofs. The reader familiar with interactive proofs may skip the next section to arrive directly at a formal 
statement of our main complexity-theoretic result in Section 1.2. 


1.1 Interactive proof systems 


An interactive proof system is an abstraction that generalizes the familiar notion of proof. Intuitively, given
a formal statement $z$ (for example, “this graph admits a proper 3-coloring”), a proof $\pi$ for $z$ is information
that enables one to check the validity of $z$ more efficiently than without access to the proof (in this example,
$\pi$ could be an explicit assignment of colors to each vertex of the graph).

Complexity theory formalizes the notion of proof in a way that emphasizes the role played by the
verification procedure. To explain this, first recall that in complexity theory a language $L$ is a subset of
$\{0,1\}^*$, the set of all bit strings of any length, that intuitively represents all problem instances to which
the answer should be “yes”. For example, the language $L$ = 3-COLORING contains all strings $z$ such that $z$ is
the description (according to some pre-specified encoding scheme) of a 3-colorable graph $G$. We say that a
language $L$ admits efficiently verifiable proofs if there exists an algorithm $V$ (formally, a polynomial-time
Turing machine) that satisfies the following two properties: (i) for any $z \in L$ there is a string $\pi$ such that
$V(z, \pi)$ returns 1 (we say that $V$ “accepts”), and (ii) for any $z \notin L$ there is no string $\pi$ such that
$V(z, \pi)$ accepts. Property (i) is generally referred to as the completeness property, and (ii) as the
soundness property. The set of all languages $L$ that admit a verifier with both these completeness and
soundness properties is denoted by the complexity class NP.
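As an illustration of the verification procedure $V(z, \pi)$ for 3-COLORING, here is a minimal Python sketch. The function name `verify_3_coloring`, the encoding of the graph as a vertex count plus an edge list, and the encoding of the proof as a dictionary are assumptions made for this example only; the text fixes no particular encoding.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

def verify_3_coloring(num_vertices: int, edges: List[Edge], proof: Dict[int, int]) -> bool:
    """Return True iff `proof` is a proper 3-coloring of the graph.

    Hypothetical encoding: vertices are 0..num_vertices-1, every edge endpoint
    is assumed to lie in that range, and `proof` maps each vertex to a color
    in {0, 1, 2}.
    """
    # Every vertex must receive one of the three colors.
    for v in range(num_vertices):
        if proof.get(v) not in (0, 1, 2):
            return False
    # No edge may join two vertices of the same color.
    return all(proof[u] != proof[v] for u, v in edges)

triangle = [(0, 1), (1, 2), (0, 2)]
print(verify_3_coloring(3, triangle, {0: 0, 1: 1, 2: 2}))  # a valid proof
print(verify_3_coloring(3, triangle, {0: 0, 1: 1, 2: 0}))  # an invalid proof
```

The check runs in time linear in the size of the graph, which is the sense in which the proof is efficiently verifiable; finding a valid coloring is not part of the verifier's job.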

Research in complexity and cryptography in the 1980s and 1990s led to a significant generalization
of the notion of “efficiently verifiable proof”. The first modification is to allow randomized verification
procedures by relaxing (i) and (ii) to high probability statements: every $z \in L$ should have a proof $\pi$ that is
accepted with probability at least $c$ (the completeness parameter), and for no $z \notin L$ should there be a proof
$\pi$ that is accepted with probability larger than $s$ (the soundness parameter). A common setting is to take
$c = \frac{2}{3}$ and $s = \frac{1}{3}$; standard amplification techniques reveal that the exact values do not
significantly affect the class of languages that admit such proofs, provided that they are chosen within
reasonable bounds.
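To make the remark about amplification concrete, the following Python sketch computes what happens when the verification is repeated $t$ times and the majority outcome is taken. It assumes the runs use independent randomness, so that each run accepts independently with probability at least $c$ (for $z \in L$) or at most $s$ (for $z \notin L$); the helper name `majority_accept_probability` is hypothetical.

```python
from math import comb

def majority_accept_probability(p: float, t: int) -> float:
    """Probability that strictly more than t/2 of t independent runs accept,
    when each run accepts with probability p (an exact binomial tail)."""
    return sum(comb(t, j) * p**j * (1 - p) ** (t - j) for j in range(t // 2 + 1, t + 1))

for t in (1, 11, 101):
    amplified_c = majority_accept_probability(2 / 3, t)  # completeness after t runs
    amplified_s = majority_accept_probability(1 / 3, t)  # soundness after t runs
    print(t, amplified_c, amplified_s)
```

As $t$ grows the two printed probabilities move towards 1 and 0 respectively, which is why any constant gap between $c$ and $s$ suffices.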

The second modification is to allow interactive verification. Informally, this means that instead of
receiving a proof string $\pi$ in its entirety and making a decision based on it, the verification algorithm (called
the “verifier”) instead communicates with another algorithm called a “prover”, and based on the
communication decides whether $z \in L$. There are no restrictions on the computational power of the prover,
whereas the verifier is required to run in polynomial time.¹

To understand how randomization and interaction can help for proof checking, consider the following
example of an interactive proof for the language GRAPH NON-ISOMORPHISM, which contains all pairs of
graphs $(G_0, G_1)$ such that $G_0$ and $G_1$ are not isomorphic.² It is not known if GRAPH NON-ISOMORPHISM $\in$
NP, because it is not clear how to give an efficiently verifiable proof string that two graphs $G_0$ and $G_1$ are
not isomorphic. (A proof of isomorphism is, of course, trivial: given a bijection from the vertices of $G_0$ to
those of $G_1$ it is straightforward to verify that the bijection induces an isomorphism.) However, consider
the following randomized, interactive verification procedure. Suppose the input to the verifier and prover
is a pair of $n$-vertex graphs $(G_0, G_1)$ (if the graphs do not have the same number of vertices, they are
trivially non-isomorphic and the verification procedure can automatically accept). The verifier first selects
a uniformly random $b \in \{0,1\}$ and a uniformly random permutation $\sigma$ of $\{1,\dots,n\}$ and sends the graph
$H = \sigma(G_b)$ to the prover. The prover is then supposed to respond with a bit $b' \in \{0,1\}$; if $b' = b$ the
verifier accepts and if $b' \neq b$ it rejects.

¹ The reader may find the following mental model useful: in an interactive proof, an all-powerful prover is trying to convince
a skeptical, but computationally limited, verifier that a string $z$ (known to both) lies in the set $L$, even when it may be that in fact
$z \notin L$. By interactively interrogating the prover, the verifier can reject false claims, i.e. determine with high statistical confidence
whether $z \in L$ or not. Importantly, the verifier is allowed to probabilistically and adaptively choose its messages to the prover.

² Here and in the rest of the section, we implicitly assume that graphs and tuples of graphs have a canonical encoding as binary
strings.

Clearly, if $G_0$ and $G_1$ are not isomorphic then there exists a prover strategy to compute $b$ from $H$ with
probability 1: using its unlimited computational power, the prover can determine whether $H$ is isomorphic
to $G_0$ or to $G_1$. However, if $G_0$ and $G_1$ are isomorphic then the distribution of $H$ is uniform over the
isomorphism class of $G_0$, which is the same as the isomorphism class of $G_1$, and the prover (despite having
unlimited computational power) cannot distinguish between whether the verifier generated $H$ using $G_0$ or
$G_1$. Thus the probability that any prover can correctly guess $b' = b$ is exactly $\frac{1}{2}$. As a result, we have
shown that the graph non-isomorphism problem has an interactive proof system with completeness $c = 1$
and soundness $s = \frac{1}{2}$. Note how little “information” is communicated by the prover: a single bit! The
extreme succinctness of the “proof” comes from the fact that whether $G_0$ is isomorphic to $G_1$ determines
whether a prover can reliably compute, given the data available to it (which is $G_0$, $G_1$, and $H$), the correct
bit $b$.
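The protocol just described can be simulated on small graphs. The Python sketch below is illustrative only: the graph encoding, the helper names, and the brute-force isomorphism test standing in for the all-powerful prover are assumptions of this example, and the prover shown is the honest one (it always answers 0 when $H$ is isomorphic to $G_0$).

```python
import random
from itertools import permutations

def as_graph(edges):
    """Normalize an edge list to a frozenset of unordered vertex pairs."""
    return frozenset(frozenset(e) for e in edges)

def relabel(graph, perm):
    """Apply the vertex permutation perm (perm[v] is the new name of v)."""
    return frozenset(frozenset(perm[v] for v in e) for e in graph)

def isomorphic(n, g, h):
    """Brute-force isomorphism test over all n! relabelings; this stands in
    for the computationally unbounded prover."""
    return any(relabel(g, p) == h for p in permutations(range(n)))

def run_round(n, g0, g1, rng):
    """One execution of the protocol with the honest prover; True means accept."""
    b = rng.randrange(2)                # verifier's secret bit b
    perm = list(range(n))
    rng.shuffle(perm)                   # uniformly random permutation sigma
    h = relabel((g0, g1)[b], perm)      # H = sigma(G_b), sent to the prover
    b_prime = 0 if isomorphic(n, g0, h) else 1
    return b_prime == b

rng = random.Random(0)
path = as_graph([(0, 1), (1, 2)])
triangle = as_graph([(0, 1), (1, 2), (0, 2)])
relabeled_path = as_graph([(0, 2), (2, 1)])

# Non-isomorphic pair: the prover is accepted in every round.
print(all(run_round(3, path, triangle, rng) for _ in range(100)))
# Isomorphic pair: the empirical acceptance rate should be close to 1/2.
print(sum(run_round(3, path, relabeled_path, rng) for _ in range(1000)) / 1000)
```

The simulation only exercises one prover in the isomorphic case; the argument in the text is what shows that no prover can do better than $\frac{1}{2}$.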

We denote by IP the class of languages that admit randomized interactive proof systems such as the one 
just described. The class IP is easily seen to contain NP, but it is thought to be a much larger class: one 
of the famous results of complexity theory is that IP is exactly the same as PSPACE [LFKN90, Sha90], the 
class of languages decidable by Turing machines using polynomial space.³ Thus a polynomial-time verifier,
when augmented with the ability to interrogate an all-powerful prover and use randomization, can solve 
computational problems that are (believed to be) vastly more difficult than those that can be checked using 
static, deterministic proofs (i.e. NP problems). 


Multiprover interactive proofs. We now discuss a generalization of interactive proofs called multiprover 
interactive proofs. Here, a polynomial-time verifier can interact with two (or more) provers to decide 
whether a given instance z is in a language L or not. In this setting, after the verifier and all the provers 
receive the common input z, the provers are not allowed to communicate with each other, and the verifier 
“cross-interrogates” the provers in order to decide if $z \in L$. The provers may coordinate a joint strategy
ahead of time, but once the protocol begins the provers can only interact with the verifier. As we will see, 
the extra condition that the provers cannot communicate with each other is a powerful constraint that can be 
leveraged by the verifier. 

Consider the computational problem of deciding membership in a promise language called GAP-MAXCUT.
A promise language $L$ is specified by two disjoint subsets $L_{yes}, L_{no} \subseteq \{0,1\}^*$, and the task is to decide
whether a given instance $z$ is in $L_{yes}$ or $L_{no}$, promised that $z \in L_{yes} \cup L_{no}$. In a proof system for a promise
language, the completeness case consists of accepting with probability at least $c$ if $z \in L_{yes}$, and the
soundness case consists of accepting with probability at most $s$ if $z \in L_{no}$. If $z \notin L_{yes} \cup L_{no}$, then there
are no constraints on the behavior of the verifier.

The promise language GAP-MAXCUT is defined as follows: GAP-MAXCUT$_{yes}$ is the set of all graphs $G$
that have a cut (i.e. a bipartition of the vertices) crossed by at least 90% of the edges, and GAP-MAXCUT$_{no}$
is the set of all graphs $G$ in which every cut is crossed by at most 60% of the edges.⁴ For simplicity, we also
assume that all graphs in GAP-MAXCUT$_{yes}$ $\cup$ GAP-MAXCUT$_{no}$ are regular, i.e. all vertices in the graph
have the same degree.

³ The reason PSPACE is considered a “difficult” class of problems is that many computational problems believed to require
super-polynomial or exponential time (such as 3-COLORING or deciding whether a quantified Boolean formula is true) can be
solved using a polynomial amount of space.

The GAP-MAXCUT problem clearly lies in NP, since given a candidate cut it is easy to count the number 
of edges that cross it and verify that it is at least 90% of the total number of edges. Observe that the length 
of the proof and the time required to verify it are linear in the size of the graph (the number of vertices and 
edges). Finding the proof is of course much harder, but we are only concerned with the complexity of the 
verification procedure. 

Now consider the following simple two-prover interactive proof system for GAP-MAXCUT. Given a
graph $G$, the verification procedure first samples a uniformly random edge $e = \{u,v\}$ in $G$. It then sends a
uniformly random $x \in \{u,v\}$ to the first prover, and a uniformly random $y \in \{u,v\}$ to the second prover.
Each prover sees its respective question only and is expected to respond with a single bit, $a,b \in \{0,1\}$
respectively. The verification procedure accepts if and only if $a = b$ when $x = y$, and $a \neq b$ when $x \neq y$.

We claim that the verification procedure described in the preceding paragraph is a multiprover interactive
proof system for the language GAP-MAXCUT, with completeness $c = 0.95$ and soundness $s = 0.9$, in the
following sense. First, whenever $G \in$ GAP-MAXCUT$_{yes}$ there is a successful strategy for the provers:
specifically, the provers can fix an optimal bipartition and consistently answer “0” when asked about a vertex
from one side of the partition, and “1” when asked about a vertex from the other side; assuming there exists
a cut that is crossed by at least 90% of the edges, this strategy succeeds with probability at least
$\frac{1}{2} + \frac{1}{2} \cdot 0.9 = 0.95$, where the first term $\frac{1}{2}$ arises from the case when both provers
are sent the same vertex, in which case they always succeed.
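The completeness calculation can be checked mechanically on small graphs. The following Python sketch computes the exact acceptance probability of the two-prover test for deterministic strategies; the function name `acceptance_probability` and the representation of a strategy as a dictionary from vertices to bits are assumptions of this example. When both provers answer according to the same cut, the result is $\frac{1}{2} + \frac{1}{2} f$, where $f$ is the fraction of edges crossing the cut.

```python
from fractions import Fraction

def acceptance_probability(edges, answer_a, answer_b):
    """Exact acceptance probability of the two-prover test.

    edges: list of pairs (u, v); answer_a, answer_b: dicts mapping each
    vertex to the bit the first (resp. second) prover answers on it.
    """
    total = Fraction(0)
    for u, v in edges:              # uniformly random edge
        for x in (u, v):            # question to the first prover
            for y in (u, v):        # question to the second prover
                a, b = answer_a[x], answer_b[y]
                accepted = (a == b) if x == y else (a != b)
                total += Fraction(int(accepted), 4 * len(edges))
    return total

# 5-cycle: the cut below is crossed by 4 of the 5 edges.
cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
cut = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0}
print(acceptance_probability(cycle5, cut, cut))  # 1/2 + 1/2 * 4/5
```

The 5-cycle is used only because it is small and regular; it is not claimed to be an instance of GAP-MAXCUT$_{yes}$.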

Conversely, suppose we are given a strategy for the provers that is accepted with probability
$p = \frac{1}{2} + \frac{1}{2}(1 - \delta)$ when the verification procedure is executed on a (regular) $n$-vertex
graph $G$. We then claim that $G$ has a cut crossed by at least a $1 - 2\delta$ fraction of all edges. To show this,
we leverage the non-communication assumption on the provers. Since either prover’s question is always a single
vertex, their strategy can be represented by a function from the vertices of $G$ to answers in $\{0,1\}$. Any such
function specifies a bipartition of $G$. While the provers’ bipartitions need not be identical, the fact that they
succeed with high probability in the case when they are sent the same vertex implies that they must be
consistent with high probability. Finally, the fact that they also succeed with high probability when sent
opposite endpoints of a randomly chosen edge implies that either prover’s bipartition must be cut by a large
number of edges. Taking the contrapositive establishes the soundness property.
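As a worked check of the soundness parameter $s = 0.9$, the bound can be unwound as follows. This is a sketch that takes the claim of the preceding paragraph (acceptance probability $\frac{1}{2} + \frac{1}{2}(1-\delta)$ forces a cut crossed by at least a $1 - 2\delta$ fraction of the edges) as given rather than re-proving it.

```latex
\begin{align*}
  G \in \text{GAP-MAXCUT}_{no}
    &\implies 1 - 2\delta \le 0.6
       && \text{(no cut is crossed by more than 60\% of the edges)} \\
    &\implies \delta \ge 0.2 \\
    &\implies p = \tfrac12 + \tfrac12 (1 - \delta) \le \tfrac12 + \tfrac12 \cdot 0.8 = 0.9 .
\end{align*}
```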

We denote by MIP the class of languages that have multiprover interactive proof systems such as the
one described in the preceding paragraph. Note that, in comparison to the NP verification procedure for
GAP-MAXCUT considered earlier, the interactive, two-prover verification is much more efficient in terms
of the effort required for the verifier. Assuming the graph is provided in a convenient format,⁵ it is possible
to sample a random edge and verify the provers’ answers in time and space that scale logarithmically with
the size of the graph. This exponential improvement in the efficiency of the verification procedure serves
as the starting point for another celebrated result from complexity theory: MIP is exactly the same as the
class NEXP [BFL91], the class of problems that admit exponential-time checkable proofs.⁶ The class NEXP
contains PSPACE, but is believed to be much larger; this suggests that the ability to interrogate more than
one prover enables a polynomial-time verifier to verify much more complex statements.

⁴ The specific numbers 90% and 60% are not too important; the only thing that really matters is that the first one is strictly less
than 100% and the second strictly larger than 50%, as otherwise the problem becomes much easier.

⁵ For example, the graph can be specified via a circuit that takes as input an edge index (using some arbitrary ordering) and
returns labels for the two endpoints of the edge.

⁶ An example of such a problem is the language SUCCINCT-3-COLORING, which contains descriptions of polynomial-size
circuits $C$ that specify a 3-colorable graph $G_C$ on exponentially many vertices.


Nonlocal games. In this paper we will only be concerned with multiprover interactive proof systems 
that consist of a single round of communication with two provers: the verifier first sends its questions to 
each of the provers, the provers respond with their answers, and the verifier decides whether to accept or 
reject. The class of problems that admit such interactive proofs is denoted MIP(2,1), and it is known that 
MIP = MIP(2,1) [FL92]. Such proof systems have a convenient reformulation using the language of 
nonlocal games, that we now explain. 

In a nonlocal game, we say that a verifier interacts with multiple non-communicating players (instead
of provers; there is no formal difference between the two terms). An $n$-question, $k$-answer nonlocal
game $\mathfrak{G}$ is specified by two procedures: a question sampling procedure that samples a pair of questions
$(x,y) \in \{1,\dots,n\}^2$ for the players according to a distribution $\mu$ (known to the verifier and the players), and
a decision procedure that takes as input the players’ questions and their respective answers $a,b \in \{1,\dots,k\}$
and evaluates a predicate $D(x,y,a,b) \in \{0,1\}$ to determine the verifier’s acceptance or rejection. In
classical complexity theory, the main quantity associated with a nonlocal game $\mathfrak{G}$ is its classical value,
which is the maximum success probability that two cooperating but non-communicating players have in the
game. Formally, the classical value is defined as

$$\mathrm{val}(\mathfrak{G}) = \sup_{p \in C_c(n,k)} \sum_{x,y} \mu(x,y) \sum_{a,b} D(x,y,a,b)\, p_{abxy}, \tag{5}$$


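where the set $C_c(n,k)$ is the set of classical correlations, which are tuples $(p_{abxy})$ such that there exists a
set $\Lambda$ with probability measure $\nu$ and for every $\lambda \in \Lambda$ functions
$A^\lambda, B^\lambda : \{1,2,\dots,n\} \to \{1,2,\dots,k\}$ such that

$$\forall x,y \in \{1,2,\dots,n\},\ \forall a,b \in \{1,2,\dots,k\}, \qquad p_{abxy} = \Pr_{\lambda \sim \nu}\big(A^\lambda(x) = a \wedge B^\lambda(y) = b\big).$$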


This definition captures the intuitive notion that a classical strategy for the players is specified by (i) a
distribution $\nu$ on $\Lambda$ that represents some probabilistic information shared by the players that is independent
of the verifier’s questions, and (ii) two functions $A^\lambda, B^\lambda$ that represent each player’s “local strategy” for
answering given their shared randomness $\lambda$ and question $x$ or $y$.⁷ Note that due to the shared randomness
$\lambda$, the set $C_c(n,k)$ is a (closed) convex subset of $[0,1]^{n^2k^2}$.
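For very small games the optimization problem (5) can be solved by exhaustive search. The Python sketch below restricts the search to deterministic strategies, which suffices because the objective in (5) is linear in $p$ and every classical correlation is, by the definition above, an average over $\lambda$ of deterministic ones. The function name `classical_value`, the 0-based indexing of questions and answers, and the toy "guess the other player's question" game are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def classical_value(n, k, mu, D):
    """Maximize the winning probability over deterministic strategies
    A, B : {0,...,n-1} -> {0,...,k-1} (runs in time k^(2n); toy sizes only)."""
    best = Fraction(0)
    for A in product(range(k), repeat=n):
        for B in product(range(k), repeat=n):
            win = sum(mu(x, y) * D(x, y, A[x], B[y])
                      for x in range(n) for y in range(n))
            best = max(best, win)
    return best

def uniform(x, y):
    """Uniform distribution over the four question pairs of a 2-question game."""
    return Fraction(1, 4)

def guess(x, y, a, b):
    """Hypothetical toy predicate: each player must output the other's question."""
    return int(a == y and b == x)

print(classical_value(2, 2, uniform, guess))
```

The search space has size $k^{2n}$, so this is only a way to experiment with definitions, not a practical algorithm.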

To make the connection with interactive proof systems, observe that the assertion that $L \in$ MIP(2,1)
precisely amounts to the specification of an efficient mapping⁸ from problem instances $z$ to games $\mathfrak{G}_z$, such
that whenever $z \in L$ then $\mathrm{val}(\mathfrak{G}_z) \geq \frac{2}{3}$, whereas if $z \notin L$ then
$\mathrm{val}(\mathfrak{G}_z) \leq \frac{1}{3}$. Thus the complexity
of the optimization problem (5) captures the complexity of the decision problem $L$. The aforementioned
characterization of MIP as the class NEXP by [BFL91] shows that in general this optimization problem is
very difficult: it is as hard as deciding any language in NEXP.


⁷ For the functional analyst we briefly note that if we define a tensor

$$L = \sum_{x,y,a,b} \mu(x,y)\, D(x,y,a,b)\, e_{xa} \otimes e_{yb} \in \mathbb{R}^{nk} \otimes \mathbb{R}^{nk}$$

then $\mathrm{val}(\mathfrak{G}) = \|L\|_{\ell_1^n(\ell_\infty^k) \otimes_\varepsilon \ell_1^n(\ell_\infty^k)}$, with $\otimes_\varepsilon$ denoting the injective tensor norm of Banach spaces. (For more connections between
interactive proofs, nonlocal games and tensor norms we refer to the survey [PV16].)

⁸ Here by “efficient” we mean that there should be a polynomial-time Turing machine that on input $z$ returns (i) a polynomial-size
randomized circuit that samples from $\mu$, and (ii) a polynomial-size circuit that evaluates the predicate $D$.


1.2 Statement of result 


We now introduce the main complexity class that is the focus of this paper: MIP*, the “entangled-prover”
analogue of the class MIP considered earlier. Informally the class MIP*, first introduced in [CHTW04],
contains all languages that can be decided by a classical polynomial-time verifier interacting with multiple
quantum provers sharing entanglement. We focus on the class MIP*(2,1), which corresponds to the setting
of one-round protocols with two provers. Equivalently, a language $L$ is in MIP*(2,1) if and only if there is an
efficient mapping from instances $z \in \{0,1\}^*$ to nonlocal games $\mathfrak{G}_z$ such that if $z \in L$, then
$\mathrm{val}^*(\mathfrak{G}_z) \geq 2/3$ and otherwise $\mathrm{val}^*(\mathfrak{G}_z) \leq 1/3$. Here, for an $n$-question,
$k$-answer game $\mathfrak{G}$, we let $\mathrm{val}^*(\mathfrak{G})$ denote its entangled value, which is defined as

$$\mathrm{val}^*(\mathfrak{G}) = \sup_{p \in C_q(n,k)} \sum_{x,y} \mu(x,y) \sum_{a,b} D(x,y,a,b)\, p_{abxy}, \tag{6}$$


where the set $C_q(n,k)$ is the set of all finite-dimensional quantum correlations, i.e. correlations of the
form (1) where the Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ are restricted to be finite-dimensional. Although the
sets $C_q(n,k)$ and $C_{qs}(n,k)$ in general are distinct [CS18], it is an easy exercise to verify that they have the
same closure $C_{qa}(n,k)$, and therefore the supremum in (6) can equivalently be taken over $C_q(n,k)$,
$C_{qs}(n,k)$ or $C_{qa}(n,k)$. We use $C_q$ for convenience in the analysis.

Since $C_c(n,k) \subseteq C_q(n,k)$ for all $n,k$, we have that $\mathrm{val}(\mathfrak{G}) \leq \mathrm{val}^*(\mathfrak{G})$; in other words, quantum spatial
strategies can perform at least as well as classical strategies in a nonlocal game.

The consideration of quantum strategies and the set $C_q(n,k)$ for the definition of MIP* is motivated by
a long line of works in the foundations of quantum mechanics around the topic of Bell inequalities, which are
linear functionals that separate the sets $C_c(n,k)$ and $C_{qs}(n,k)$. The simplest such functional is the CHSH
inequality [CHSH69], which shows that $C_c(2,2) \subsetneq C_{qs}(2,2)$. The CHSH inequality can be reformulated as a
game $\mathfrak{G}$ such that $\mathrm{val}^*(\mathfrak{G}) > \mathrm{val}(\mathfrak{G})$. This game is very simple: it is defined by setting
$\mu(x,y) = \frac{1}{4}$ for all $x,y \in \{0,1\}$ and $D(x,y,a,b) = 1$ if and only if $a \oplus b = x \wedge y$. It can be shown that
$\mathrm{val}(\mathfrak{G}) = \frac{3}{4}$ and $\mathrm{val}^*(\mathfrak{G}) = \frac{1}{2} + \frac{1}{2\sqrt{2}} > \frac{3}{4}$. The study of Bell
inequalities is a large area of research not only in foundations,
where they are a tool to study the nonlocal properties of entanglement, but also in quantum cryptography,
where they form the basis for cryptographic protocols for e.g. quantum key distribution [Eke91].
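The two values quoted for the CHSH game can be checked numerically. The Python sketch below computes the classical value by enumerating the 16 deterministic strategies, and evaluates one explicit quantum strategy through equation (1). The choice of state (a maximally entangled pair of qubits) and of measurement angles is a standard textbook choice rather than something specified in the text, and all helper names are hypothetical. The quantum computation only certifies a lower bound on $\mathrm{val}^*$; it does not prove optimality.

```python
from itertools import product
from math import cos, sin, pi, sqrt

def projectors(theta):
    """The two rank-one projectors of a qubit measurement along angle theta."""
    c, s = cos(theta), sin(theta)
    return [[[c * c, c * s], [c * s, s * s]],       # outcome 0: (cos, sin)
            [[s * s, -c * s], [-c * s, c * c]]]     # outcome 1: (-sin, cos)

def kron(P, Q):
    """Kronecker product of two 2x2 matrices, as a 4x4 nested list."""
    return [[P[i][j] * Q[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(psi, M):
    """psi^* M psi for a real 4-dimensional vector psi."""
    return sum(psi[r] * M[r][c] * psi[c] for r in range(4) for c in range(4))

psi = [1 / sqrt(2), 0.0, 0.0, 1 / sqrt(2)]        # (|00> + |11>) / sqrt(2)
alice = [projectors(0.0), projectors(pi / 4)]     # indexed by question x
bob = [projectors(pi / 8), projectors(-pi / 8)]   # indexed by question y

def chsh_quantum_strategy_value():
    """Winning probability of the strategy above, with p_abxy from equation (1)."""
    total = 0.0
    for x, y, a, b in product((0, 1), repeat=4):
        if (a ^ b) == (x & y):
            total += 0.25 * expectation(psi, kron(alice[x][a], bob[y][b]))
    return total

def chsh_classical_value():
    """Best winning probability over the 16 deterministic strategies."""
    return max(sum((A[x] ^ B[y]) == (x & y) for x in (0, 1) for y in (0, 1)) / 4
               for A in product((0, 1), repeat=2) for B in product((0, 1), repeat=2))

print(chsh_classical_value())
print(chsh_quantum_strategy_value(), 0.5 + 1 / (2 * sqrt(2)))
```

The last line prints the strategy's winning probability next to the closed form $\frac{1}{2} + \frac{1}{2\sqrt{2}}$ for comparison.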

The introduction of entanglement in the setting of interactive proofs has interesting consequences for
complexity theory; indeed it is not a priori clear how the class MIP* compares to MIP. Take a language
$L \in$ MIP(2,1), and let $z$ be an instance. Then the associated game $\mathfrak{G}_z$ is such that
$\mathrm{val}(\mathfrak{G}_z) \geq \frac{2}{3}$ if $z \in L$, and $\mathrm{val}(\mathfrak{G}_z) \leq \frac{1}{3}$ otherwise. The fact that in general
$\mathrm{val}^*(\mathfrak{G}_z) \geq \mathrm{val}(\mathfrak{G}_z)$ (and that, as demonstrated by
the CHSH game, the inequality can be strict) cuts both ways. On the one hand, the soundness property can be
affected, so that instances $z \notin L$ could have $\mathrm{val}^*(\mathfrak{G}_z) = 1$, meaning that we would not be able to establish
that $L \in$ MIP*. On the other hand, a language $L \in$ MIP*(2,1) may not necessarily be in MIP, because for
$z \in L$ the fact that $\mathrm{val}^*(\mathfrak{G}_z) \geq \frac{2}{3}$ does not automatically imply
$\mathrm{val}(\mathfrak{G}_z) \geq \frac{2}{3}$ (in other words, the game $\mathfrak{G}_z$
may require the players to use a quantum strategy in order to win with probability greater than 1/3). Just
as the complexity of the class MIP is characterized by the complexity of approximating the classical value
of nonlocal games (the optimization problem in (5)), the complexity of MIP* is intimately related to the
complexity of approximating the entangled value of games (the optimization problem in (6)).

In [IV12] the first non-trivial lower bound on MIP* was shown, establishing that MIP = NEXP $\subseteq$ MIP*.
(Earlier results [KKM+11, IKM09] gave more limited hardness results, for approximating the entangled
value up to inverse polynomial precision.) This was proved by arguing that for the specific games
constructed by [BFL91] that show NEXP $\subseteq$ MIP, the classical and entangled values are approximately the
same. In other words, the classical soundness and completeness properties of the proof system of [BFL91]
are maintained in the presence of shared entanglement between the provers. Following [IV12] a sequence of
works [Vid16, Ji16, NV18b, Ji17, NV18a, FJVY19] established progressively stronger lower bounds on the
complexity of approximating the entangled value of nonlocal games, culminating in [NW19] which showed
that approximating the entangled value is at least as hard as NEEXP, the collection of languages decidable
in non-deterministic doubly exponential time. This proves that NEEXP $\subseteq$ MIP*, and since it is known that
NEXP $\subsetneq$ NEEXP it follows that MIP $\neq$ MIP*.

In contrast to these increasingly strong lower bounds, the only upper bound known on MIP* is the trivial
inclusion MIP* $\subseteq$ RE, the class of recursively enumerable languages, i.e. languages $L$ such that there
exists a Turing machine $M$ such that $x \in L$ if and only if $M$ halts and accepts on input $x$. This inclusion
follows since the supremum in (6) can be approximated from below by performing an exhaustive search in
increasing dimension and with increasing accuracy. We note that, in addition to containing all decidable
languages, this class also contains undecidable problems such as the Halting problem, which is to decide
whether a given Turing machine eventually halts.

Our main result is a proof of the reverse inclusion: $\mathrm{RE} \subseteq \mathrm{MIP}^*$. Combined with the preceding
observation it follows that

MIP* = RE, 


which is a full characterization of the power of entangled-prover interactive proofs. In particular for any
$0 < \varepsilon < 1$, it is an undecidable problem to determine whether a given nonlocal game has entangled value 1
or at most $1 - \varepsilon$ (promised that one is the case).


Proof summary. The proof of the inclusion $\mathrm{RE} \subseteq \mathrm{MIP}^*$ is obtained by designing an entangled-prover
interactive proof for the Halting problem, which is complete for the class RE. Specifically, we design an
efficient transformation that maps any Turing machine $M$ to a nonlocal game $\mathfrak{G}_M$ such that, if $M$ halts
(when run on an empty input tape) then there is a quantum strategy for the provers that succeeds with
probability 1 in $\mathfrak{G}_M$ (i.e. $\mathrm{val}^*(\mathfrak{G}_M) = 1$), whereas if $M$ does not halt then no quantum strategy can
succeed with probability larger than $\frac{1}{2}$ in the game (i.e. $\mathrm{val}^*(\mathfrak{G}_M) \le \frac{1}{2}$). Furthermore, the game $\mathfrak{G}_M$
has the property that whenever $\mathrm{val}^*(\mathfrak{G}_M) = 1$ then this fact is witnessed by a synchronous strategy, i.e.
a strategy where the players always give the same answer when simultaneously asked the same question.
Synchronous strategies, or correlations, were first introduced in [PSS+16] and have played an important
role in approaches to CEP based on quantum information and the study of nonlocal games; see e.g. [KPS18]
and references therein. In the paper we use the terminology of projective, consistent and commuting (PCC)
strategies (see Definition 5.11 in Section 5.1), a notion which implies the notion of being synchronous.

A very rough sketch of this construction is as follows (we give a detailed overview in Section 2). Given
an infinite family of games $\{\mathfrak{G}_n\}_{n\in\mathbb{N}}$, we say that the family is uniformly generated if there is a
polynomial-time Turing machine that on input $n$ returns a description of the game $\mathfrak{G}_n$. Given a game $\mathfrak{G}$
and $p \in [0,1]$ let $\mathcal{E}(\mathfrak{G}, p)$ denote the minimum local dimension of an entangled state shared by the
players in order for them to succeed in $\mathfrak{G}$ with probability at least $p$.

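We proceed in two steps. First, we design a compression procedure for a specific class of nonlocal
games that we call normal form. Given as input a uniformly generated family $\{\mathfrak{G}_n\}_{n\in\mathbb{N}}$ of normal form
games, the compression procedure returns another uniformly generated family $\{\mathfrak{G}'_n\}_{n\in\mathbb{N}}$ of normal form
games with the following properties: (i) for all $n$, if $\mathrm{val}^*(\mathfrak{G}_{2^n}) = 1$ then $\mathrm{val}^*(\mathfrak{G}'_n) = 1$, and (ii) for all $n$, if
$\mathrm{val}^*(\mathfrak{G}_{2^n}) \le \frac{1}{2}$ then $\mathrm{val}^*(\mathfrak{G}'_n) \le \frac{1}{2}$ and moreover

$$\mathcal{E}\big(\mathfrak{G}'_n, \tfrac{1}{2}\big) \;\ge\; \max\Big\{ \mathcal{E}\big(\mathfrak{G}_{2^n}, \tfrac{1}{2}\big),\; 2^{2^{\Omega(n)}} \Big\}.$$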


The construction of this compression procedure is our main contribution. Informally, it combines the
recursive compression technique developed in [Ji17, FJVY19] with the so-called “introspection” technique
of [NW19] that was used to prove $\mathrm{NEEXP} \subseteq \mathrm{MIP}^*$. The introspection technique itself relies heavily on
the quantum low individual degree test of [NV18a, JNV+20] to robustly self-test certain distributions that
arise from constructions of classical probabilistically checkable proofs. The quantum low-degree test and
the introspection technique allow us to avoid the shrinking gap limitation of the results from [FJVY19].

In the second step, we use the compression procedure in an iterated fashion to construct an interactive
proof system for the Halting problem. Fix a Turing machine $M$ and consider the following family of
nonlocal games $\{\mathfrak{G}^{(0)}_{M,n}\}_{n\in\mathbb{N}}$: for all $n \in \mathbb{N}$, if $M$ halts in at most $n$ steps (when run on an empty input
tape), then $\mathrm{val}^*(\mathfrak{G}^{(0)}_{M,n}) = 1$, and otherwise $\mathrm{val}^*(\mathfrak{G}^{(0)}_{M,n}) \le \frac{1}{2}$.
Constructing such a family of games is trivial; furthermore, they can be made in the “normal form”
required by the compression procedure. However, consider applying the compression procedure to obtain a
family of normal form games $\{\mathfrak{G}^{(1)}_{M,n}\}_{n\in\mathbb{N}}$. Then for all $n \in \mathbb{N}$, it holds that if $M$ halts in at most $2^n$ steps
then $\mathrm{val}^*(\mathfrak{G}^{(1)}_{M,n}) = 1$, and otherwise $\mathrm{val}^*(\mathfrak{G}^{(1)}_{M,n}) \le \frac{1}{2}$, and furthermore any strategy that achieves a value
of at least $\frac{1}{2}$ requires an entangled state of dimension at least $2^{2^{\Omega(n)}}$.

Intuitively, one would expect that iterating this procedure and “taking the limit” gives a family of games
$\{\mathfrak{G}^{(\infty)}_{M,n}\}_{n\in\mathbb{N}}$ such that if $M$ halts then $\mathrm{val}^*(\mathfrak{G}^{(\infty)}_{M,n}) = 1$ for all $n \in \mathbb{N}$, whereas if $M$ does not halt then no
finite-dimensional strategy can succeed with probability larger than $\frac{1}{2}$ in $\mathfrak{G}^{(\infty)}_{M,n}$ for all $n \in \mathbb{N}$; in particular
$\mathrm{val}^*(\mathfrak{G}^{(\infty)}_{M,n}) \le \frac{1}{2}$. Formally, we do not take such a limit but instead define directly the family of games
$\{\mathfrak{G}^{(\infty)}_{M,n}\}_{n\in\mathbb{N}}$ as a fixed point of the Turing machine that implements the compression procedure. The game
$\mathfrak{G}_M$ can then be taken as $\mathfrak{G}^{(\infty)}_{M,1}$. We describe this in more detail in Section 2.


1.3 Consequences 


Our result is motivated by a connection with Tsirelson’s problem from quantum information theory, itself
related to Connes’ Embedding Conjecture in the theory of von Neumann algebras [Con76]. In a celebrated
sequence of papers, Tsirelson [Tsi93] initiated the systematic study of quantum correlation sets. Recall the
definition of the set of quantum spatial correlations

$$C_{qs}(n,k) = \Big\{ (p_{abxy}) \;\Big|\; p_{abxy} = \langle\psi| A^x_a \otimes B^y_b |\psi\rangle,\ |\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B,\ \forall x,y:\ \{A^x_a\}_a, \{B^y_b\}_b \text{ POVM} \Big\}, \qquad (7)$$

where here $|\psi\rangle$ ranges over all unit norm vectors $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ with $\mathcal{H}_A$ and $\mathcal{H}_B$ arbitrary (separable)
Hilbert spaces, and a POVM is defined as a collection of positive semidefinite operators that sum to identity.
(From now on we use the Dirac ket notation $|\cdot\rangle$ for states.) Recall the closure $C_{qa}(n,k)$ of $C_{qs}(n,k)$.
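
As a concrete finite-dimensional instance of definition (7), the sketch below tabulates the correlation produced by the well-known CHSH strategy on one EPR pair, with two questions and two answers per player. This example is not taken from the paper; the angles, the function names and the closed form $\langle\psi| A \otimes B |\psi\rangle = \frac{1}{2}\cos^2(\theta_A - \theta_B)$ for real rank-one projectors on the state $(|00\rangle + |11\rangle)/\sqrt{2}$ are illustrative choices.

```python
import math


def chsh_correlation():
    """Return p[(a, b, x, y)] for a standard CHSH strategy on one EPR pair."""
    alice = {0: 0.0, 1: math.pi / 4}          # measurement angle for question x
    bob = {0: math.pi / 8, 1: -math.pi / 8}   # measurement angle for question y
    p = {}
    for x, ta in alice.items():
        for y, tb in bob.items():
            for a in (0, 1):
                for b in (0, 1):
                    # Outcome a (resp. b) projects onto the real unit vector at
                    # angle ta + a*pi/2 (resp. tb + b*pi/2).  On the state
                    # (|00> + |11>)/sqrt(2) the probability is cos^2(diff)/2.
                    diff = (ta + a * math.pi / 2) - (tb + b * math.pi / 2)
                    p[(a, b, x, y)] = 0.5 * math.cos(diff) ** 2
    return p


def chsh_win_probability(p):
    """Average over uniform questions of Pr[a XOR b == x AND y]."""
    return sum(v for (a, b, x, y), v in p.items() if (a ^ b) == (x & y)) / 4


correlation = chsh_correlation()
print(chsh_win_probability(correlation))
```

Each fixed question pair yields a probability distribution over answer pairs, and the winning probability of this correlation equals $\cos^2(\pi/8)$, which exceeds the best classical success probability of 3/4 for the same game.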

Tsirelson observed that there is a natural alternative definition to the quantum spatial correlation set, 
called the quantum commuting correlation set and defined as 


$$C_{qc}(n,k) = \Big\{ (p_{abxy}) \;\Big|\; p_{abxy} = \langle\psi| A^x_a B^y_b |\psi\rangle \Big\}, \qquad (8)$$

where $|\psi\rangle \in \mathcal{H}$ is a quantum state, $\{A^x_a\}$ and $\{B^y_b\}$ are POVMs for all $x, y$, and $[A^x_a, B^y_b] = 0$ for all
$a, b, x, y$. Note the key difference with spatial correlations is that in (8) all operators act on the same
(separable) Hilbert space. The requirement that operators associated with different inputs (questions) $x, y$
commute is arguably a minimal requirement within the context of quantum mechanics for there to not exist
any causal connection between outputs (answers) $a, b$ obtained in response to the respective input.

^9 There is nothing special about the choice of $\frac{1}{2}$; this can be set to any constant that is less than 1.

The set $C_{qc}(n,k)$ is closed and convex, and it is easy to see that $C_{qa}(n,k) \subseteq C_{qc}(n,k)$ for all $n, k \ge 1$.
When Tsirelson initially introduced these sets he claimed that equality holds. However, it was later pointed
out that this is not obviously true. The question of equality between $C_{qc}$ and $C_{qa}$ (for all $n, k$) is now known
as Tsirelson’s problem [Tsi06]. Let $C_q(n,k)$ denote the same as $C_{qs}(n,k)$ except that both $\mathcal{H}_A$ and $\mathcal{H}_B$
in (7) are restricted to finite-dimensional spaces. Then more generally one can consider the following chain
of inclusions

$$C_q(n,k) \subseteq C_{qs}(n,k) \subseteq C_{qa}(n,k) \subseteq C_{qc}(n,k), \qquad (9)$$


for all $n, k \in \mathbb{N}$, and ask which (if any) of these inclusions are strict. We let $C_q$, $C_{qs}$, $C_{qa}$, $C_{qc}$ denote the
union of $C_q(n,k)$, $C_{qs}(n,k)$, $C_{qa}(n,k)$, $C_{qc}(n,k)$, respectively, over all integers $n, k \in \mathbb{N}$. In a breakthrough
work, Slofstra established the first separation between these four correlation sets by proving that
$C_{qs} \neq C_{qc}$ [Slo19b]; he later proved the stronger statement that $C_{qs} \neq C_{qa}$ [Slo19a]. As a consequence of the
technique used to demonstrate the separation Slofstra also obtains the complexity-theoretic statement that
the problem of determining whether an element $p$ lies in $C_{qc}$, even promised that if it does, then it also lies in
$C_{qa}$, is undecidable. Interestingly, this is shown by reduction from the complement of the halting problem;
for our result we reduce from the halting problem (see Section 1.4 for further discussion of this point).
Since his work, simpler proofs of Slofstra’s results have been found [DPP19, MR18, Coll19]. In [CS18],
Coladangelo and Stark showed that $C_q \neq C_{qs}$ by exhibiting a 5-input, 3-output correlation that can be
attained using infinite-dimensional spatial strategies (i.e. infinite-dimensional Hilbert spaces, a state and
POVMs satisfying (7)) but cannot be attained via finite-dimensional strategies.

As already noted in [FNT14] (and further elaborated on by [FJVY19]), the undecidability of MIP*(2,1)
implies the separation $C_{qa} \neq C_{qc}$. This follows from the observation that if $C_{qa} = C_{qc}$, then there exists
an algorithm that can correctly determine if a nonlocal game $\mathfrak{G}$ satisfies $\mathrm{val}^*(\mathfrak{G}) = 1$ or $\mathrm{val}^*(\mathfrak{G}) \le \frac{1}{2}$ and
always halts: this algorithm interleaves a hierarchy of semidefinite programs providing outer approximations
to the set $C_{qc}$ [NPA08, DLTW08] with a simple exhaustive search procedure providing inner approximations
to $C_q$. Our result that $\mathrm{RE} \subseteq \mathrm{MIP}^*(2,1)$ implies that no such algorithm exists, thus resolving Tsirelson’s
problem in the negative.
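
The interleaving argument can be sketched as a short loop. This is a schematic illustration only: `inner_bound(level)` and `outer_bound(level)` are hypothetical oracles standing in for, respectively, the best value found by exhaustive search over finite-dimensional strategies at a given level and the upper bound returned by the corresponding level of the semidefinite programming hierarchy.

```python
def decide_value(inner_bound, outer_bound):
    """Decide 'value 1' versus 'value at most 1/2' under the promise.

    inner_bound(level): hypothetical lower bounds on val*, converging to val*.
    outer_bound(level): hypothetical upper bounds on the commuting-operator
    value, converging to it.  If the two values coincide (as they would when
    C_qa = C_qc), one of the two tests below must eventually fire.
    """
    level = 1
    while True:
        if inner_bound(level) > 0.5:
            # A finite-dimensional strategy beats 1/2, so val* = 1 by the promise.
            return "value 1"
        if outer_bound(level) < 1.0:
            # No commuting-operator strategy is perfect, so val* <= 1/2 by the promise.
            return "value at most 1/2"
        level += 1


# Toy bound sequences (assumptions for illustration only).
print(decide_value(lambda l: 1 - 1 / (l + 1), lambda l: 1.0))
print(decide_value(lambda l: 0.4, lambda l: min(1.0, 0.5 + 1 / l)))
```

With genuine bounds the loop would contradict the undecidability established here whenever it always terminates, which is why the two limiting values cannot agree for every game.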

We furthermore exhibit an explicit nonlocal game $\mathfrak{G}$ such that $\mathrm{val}^*(\mathfrak{G}) < \mathrm{val}^{co}(\mathfrak{G}) = 1$, where
$\mathrm{val}^{co}(\mathfrak{G})$ is defined as $\mathrm{val}^*(\mathfrak{G})$ except that the supremum is taken over the set $C_{qc}(n,k)$ in (8). This in
turn yields an explicit correlation that is in the set $C_{qc}$ but not in $C_{qa}$. This game closely resembles the game
$\mathfrak{G}_M$ described in the sketch of the proof that $\mathrm{RE} \subseteq \mathrm{MIP}^*$, where $M$ is the Turing machine that runs the
hierarchy of semidefinite programs on the game $\mathfrak{G}_M$ and halts if it certifies that $\mathrm{val}^{co}(\mathfrak{G}_M) < 1$. It is in
principle possible to determine an upper bound on the parameters $n, k$ for our separating correlation from the
proof. While we do not provide such a bound, there is no step in the proof that requires it to be astronomical;
e.g. we believe (without proof) that $10^{20}$ is a clear upper bound.


Connes’ Embedding Conjecture. Connes’ Embedding Conjecture (CEC) [Con76] is a conjecture in the
theory of von Neumann algebras. Briefly, CEC posits that every type $\mathrm{II}_1$ von Neumann factor embeds into
an ultrapower of the hyperfinite $\mathrm{II}_1$ factor. We refer to [Oza13] for a precise formulation of the conjecture
and connections to other conjectures in operator algebras, such as Kirchberg’s QWEP conjecture. In
independent work Fritz [Fri12] and Junge et al. [JNP+11] showed that a positive answer to CEC would imply
a positive resolution of Tsirelson’s problem, i.e. that $C_{qa}(n,k) = C_{qc}(n,k)$ for all $n, k$. (This was later
promoted to an equivalence by Ozawa [Oza13].) Since our result disproves this equality for some $n, k$ it also
implies that CEC does not hold. In work that appeared subsequently to the first announcement of our results,
Goldbring and Hart [GH16, GH20] show using arguments from continuous logic that the uncomputability
of approximating $\mathrm{val}^*(\mathfrak{G})$ refutes the CEC. Interestingly, their argument uses general elementary
considerations from logic and completely bypasses Tsirelson’s problem and its equivalence with Kirchberg’s QWEP
conjecture. We note that using the constructive aspect of our result it may be possible to give an explicit
description of a factor that does not embed into an ultrapower of the hyperfinite $\mathrm{II}_1$ factor, but we do not
give such a construction.

^10 Technically [FNT14] make the observation for the commuting-prover analogue $\mathrm{MIP}^{co}(2,1)$, discussed further in
Section 1.4, but the reasoning is the same.


Entanglement tests. As a step towards showing our result for any integer $n \ge 1$ we construct a game
$\mathfrak{G}_n$ with question and answer length polynomial in the size of the smallest Turing machine $M_n$ that halts
(on the empty tape) in exactly $n$ steps (i.e. the Kolmogorov complexity of $n$), such that $\mathrm{val}^*(\mathfrak{G}_n) = 1$ yet
any quantum strategy that succeeds in $\mathfrak{G}_n$ with probability larger than $\frac{1}{2}$ must use an entangled state whose
Schmidt rank is at least $2^{\Omega(n)}$. This is by far the most efficient entanglement test that we are aware of.


Prover and round reduction for MIP* protocols. Let MIP*(k, r) denote the collection of languages
decidable by MIP* protocols with $k \ge 2$ provers and $r$ rounds. Prior to our work it was known how to
perform round reduction for MIP* protocols, at the cost of adding provers; it was shown by [Ji17, FJVY19]
that $\mathrm{MIP}^*(k,r) \subseteq \mathrm{MIP}^*(k+15,1)$ for all $k, r$. However, it was an open question whether the complexity
of the class MIP* increases if we add more provers. Our main complexity-theoretic result implies that
MIP* = MIP*(2,1). This follows from the following chain of inclusions: for all polynomially-bounded
functions $k, r$,

$$\mathrm{MIP}^*(2,1) \subseteq \mathrm{MIP}^*(k,r) \subseteq \mathrm{RE} \subseteq \mathrm{MIP}^*(2,1).$$

The first inclusion follows since the verifier in an MIP* protocol can always ignore extra provers and rounds;
the second inclusion follows from a simple exhaustive-search procedure that enumerates over strategies for
a given MIP*(k, r) protocol; the third inclusion is proven in this paper.^11

However, this method of reducing provers and rounds in a given MIP* protocol is indirect; it involves
first converting a given MIP* protocol into a Turing machine that accepts if and only if the MIP* protocol
has value larger than $\frac{1}{2}$ and then constructing an MIP*(2,1) protocol to decide whether the Turing machine
halts. In particular this transformation does not generally preserve the complexity of the provers and verifier
in the original protocols. We leave it as an open question to find a more direct method for reducing the
number of provers in an MIP* protocol.


1.4 Open questions 


We mention several questions left open by our work. 


Explicit constructions of counter-examples to Connes’ Embedding Conjecture. We provide an
explicit counter-example to Tsirelson’s problem in the form of a game whose entangled value differs from
its commuting-operator value. Through the aforementioned connection with Connes’ embedding
conjecture [Fri12, JNP+11, Oza13], the counter-example may lead to the construction of interesting objects in
other areas of mathematics. A first question is whether it can lead to an explicit description of a type $\mathrm{II}_1$
factor that does not satisfy the Connes embedding property. Such a construction could be obtained along
the lines of [KPS18], using the fact that our game $\mathfrak{G}$ such that $\mathrm{val}^*(\mathfrak{G}) < \mathrm{val}^{co}(\mathfrak{G}) = 1$ has the property
of being synchronous, i.e. perfect strategies in the game are required to return the same answer when both
parties are provided the same question.

^11 In fact, we note that the second term MIP*(k, r) can be replaced by QMIP*(k, r), which is the analogous class with a
quantum verifier and quantum messages, since the first inclusion is trivial and the second remains true. As a result, we
obtain that QMIP* = MIP*(2,1) as well.

Going further, one may ask if the example can eventually lead to a construction of a group that is not 
sofic, or even not hyperlinear (see e.g. [CLP15] for the connection). 


The complexity of variants of MIP*. Our result characterizes the complexity class MIP* as the set of
recursively enumerable languages. One can also consider the complexity class $\mathrm{MIP}^{co}$, which stands for
multiprover interactive proofs in the commuting-operator model. For the sake of the discussion we consider
only two-prover one-round protocols; a language $L$ is in $\mathrm{MIP}^{co}(2,1)$ if there exists an efficient reduction
that maps $z \in \{0,1\}^*$ to a nonlocal game $\mathfrak{G}_z$ such that if $z \in L$ then $\mathrm{val}^{co}(\mathfrak{G}_z) \ge \frac{2}{3}$, and otherwise
$\mathrm{val}^{co}(\mathfrak{G}_z) \le \frac{1}{3}$.

The semidefinite programming hierarchy of [NPA08, DLTW08] can be used to show that $\mathrm{MIP}^{co}(2,1)$
is contained in the complement of RE, denoted as coRE: to certify that $z \notin L$ it suffices to run the hierarchy
until it obtains a certificate that $\mathrm{val}^{co}(\mathfrak{G}_z) < \frac{2}{3}$. Since it is known that $\mathrm{RE} \neq \mathrm{coRE}$,^12 this implies that
$\mathrm{MIP}^*(2,1) \neq \mathrm{MIP}^{co}(2,1)$.

It is thus plausible that $\mathrm{MIP}^{co} = \mathrm{coRE}$,^13 which would provide a very pleasing “dual” complexity
characterization to MIP* = RE. One possible route to proving this would be to adapt our gap-preserving
compression framework to the commuting-operator setting by showing that each of the steps (question
reduction, answer reduction, and parallel repetition) remains sound against commuting-operator strategies.
Using the connection established in [FNT14], this would imply that the operator norm over the maximal C*
algebra $C^*(F_2 \times F_2)$, where $F_2$ is the free group on two elements, is uncomputable.

Another interesting open question concerns the zero gap variants of MIP* and $\mathrm{MIP}^{co}$, which we denote
by $\mathrm{MIP}^*_0$ and $\mathrm{MIP}^{co}_0$, respectively. These classes capture the complexity of deciding whether a nonlocal game
$\mathfrak{G}$ has entangled value (or commuting-operator value respectively) exactly equal to 1. In [Slo19a], Slofstra
shows that there is an efficient reduction from Turing machines $M$ to nonlocal games $\mathfrak{G}_M$ such that
$M$ does not halt if and only if $\mathrm{val}^*(\mathfrak{G}_M) = \mathrm{val}^{co}(\mathfrak{G}_M) = 1$. This implies that $\mathrm{coRE} = \mathrm{MIP}^{co}_0(2,1)$
and furthermore $\mathrm{coRE} \subseteq \mathrm{MIP}^*_0$. However, since $\mathrm{RE} \subseteq \mathrm{MIP}^*(2,1) \subseteq \mathrm{MIP}^*_0(2,1)$, this implies that
$\mathrm{MIP}^*_0(2,1)$ is strictly bigger than both RE and coRE. In fact, it was shown by Mousavi et al. [MNY20]
that the class $\mathrm{MIP}^*_0(3,1)$ (the complexity class corresponding to determining whether $\mathrm{val}^*(\mathfrak{G}) = 1$ for
three-player nonlocal games) is equal to $\Pi^0_2$, which is the set of all languages $L$ such that $x \in L$ if and
only if $\forall y, \exists z\ R(x,y,z) = 1$ for a computable function $R$ that depends on $L$.^14 Thus it is plausible that
the complexity landscape of nonlocal games looks like the following: $\mathrm{MIP}^* = \mathrm{RE}$, $\mathrm{MIP}^*_0 = \Pi^0_2$, and
$\mathrm{MIP}^{co} = \mathrm{MIP}^{co}_0 = \mathrm{coRE}$. Such statements about the complexity of MIP* versus $\mathrm{MIP}^{co}$, in both the
gapped and zero-gap cases, may reveal additional insights into the difference between the tensor product
and commuting-operator models of correlations.


^12 $\mathrm{RE} \neq \mathrm{coRE}$ follows from the fact that $\mathrm{RE} \cap \mathrm{coRE}$ is the set of decidable languages and RE contains
undecidable languages.

^13 We note that the “co” modifier on the two sides of the equation $\mathrm{MIP}^{co} = \mathrm{coRE}$ refers to different things!

^14 The class $\Pi^0_2$ is also characterized as being part of the second level of the arithmetical hierarchy from
computability theory, where $\mathrm{RE} = \Sigma^0_1$ and $\mathrm{coRE} = \Pi^0_1$ form the first level.


Acknowledgments. We thank Matthew Coudron, William Slofstra and Jalex Stark for enlightening
discussions regarding possible consequences of our work. We thank William Slofstra and Jalex Stark for
suggestions regarding explicit separations between $C_{qa}$ and $C_{qc}$. We thank Peter Burton, William Slofstra
and Jalex Stark for comments on a previous version. We thank Lewis Bowen and Mikael de la Salle for
pointing out typos and minor errors in a previous version.

Zhengfeng Ji is supported by Australian Research Council (DP200100950). Anand Natarajan is
supported by IQIM, an NSF Physics Frontiers Center (NSF Grant PHY-1733907). Thomas Vidick is supported
by NSF CAREER Grant CCF-1553477, AFOSR YIP award number FA9550-16-1-0495, a CIFAR Azrieli
Global Scholar award, MURI Grant FA9550-18-1-0161 and the IQIM, an NSF Physics Frontiers Center
(NSF Grant PHY-1125565) with support of the Gordon and Betty Moore Foundation (GBMF-12500028).
Henry Yuen is supported by NSERC Discovery Grant 2019-06636. Part of this work was done while John
Wright was at the Massachusetts Institute of Technology. He is supported by IQIM, an NSF Physics
Frontiers Center (NSF Grant PHY-1733907), and by ARO contract W911NF-17-1-0433.



2 Proof Overview 


In this section we give an overview of the proof of the inclusion $\mathrm{RE} \subseteq \mathrm{MIP}^*$. Since all interactive proof
systems considered in the paper involve a single-round interaction between a classical verifier and two
quantum provers sharing entanglement we generally use the language of nonlocal games to describe such
proof systems, and often refer to the provers as “players”. In a nonlocal game $\mathfrak{G}$ (or simply “game” for
short), the verifier can be described as the combination of two procedures: a question sampling procedure
that samples a pair of questions $(x, y)$ for the players according to a distribution $\mu$ (known to the verifier and
the players), and a decision procedure (also known to all parties) that takes as input the players’ questions
and their respective answers $a, b$ and evaluates a predicate $D(x, y, a, b) \in \{0, 1\}$ to determine the verifier’s
acceptance or rejection. Given a description of a nonlocal game $\mathfrak{G}$, recall that $\mathrm{val}^*(\mathfrak{G})$ denotes the entangled
value of the game, which is defined as the supremum (6) of the players’ success probability in the game over
all finite-dimensional tensor product strategies. (We refer to Section 5 for definitions regarding nonlocal
games.)
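
To make the two ingredients of a game concrete, the sketch below represents a game by a question distribution `mu` and a predicate, and computes the classical value by brute force over deterministic strategies. It deliberately does not compute the entangled value $\mathrm{val}^*$, which is the quantity shown in this paper to be uncomputable to approximate; the CHSH game and all identifiers are illustrative assumptions rather than objects from the paper.

```python
from fractions import Fraction
from itertools import product


def classical_value(questions, answers, mu, predicate):
    """Best success probability over deterministic classical strategies.

    mu maps question pairs (x, y) to probabilities; predicate(x, y, a, b)
    plays the role of the decision procedure D.
    """
    best = Fraction(0)
    # A deterministic strategy assigns one answer to every question.
    for f in product(answers, repeat=len(questions)):
        alice = dict(zip(questions, f))
        for g in product(answers, repeat=len(questions)):
            bob = dict(zip(questions, g))
            win = sum(prob for (x, y), prob in mu.items()
                      if predicate(x, y, alice[x], bob[y]))
            best = max(best, win)
    return best


# CHSH: uniform questions, accept when a XOR b equals x AND y.
chsh_mu = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}


def chsh_predicate(x, y, a, b):
    return (a ^ b) == (x & y)


print(classical_value((0, 1), (0, 1), chsh_mu, chsh_predicate))
```

Shared randomness cannot improve on the best deterministic strategy, so this maximum is the classical value; the search is finite because the strategy space is finite, in contrast with the supremum over all finite dimensions that defines $\mathrm{val}^*$.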

Our results establish the existence of transformations on families of nonlocal games $\{\mathfrak{G}_n\}_{n\in\mathbb{N}}$ having
certain properties. In order to keep track of efficiency (and ultimately, computability) properties it is
important to have a way to specify such families in a uniform manner. Towards this we introduce the
following formalism. A uniformly generated family of games is specified through a pair of Turing machines
$\mathcal{V} = (\mathcal{S}, \mathcal{D})$ that satisfy certain conditions, in which case the pair is called a normal form verifier. The
Turing machine $\mathcal{S}$ (called a sampler) takes as input an index $n \in \mathbb{N}$ and returns the description of a procedure
that can be used to sample questions $(x, y)$ in the game (this procedure itself obeys a certain format
associated with “conditionally linear” distributions, defined below). The Turing machine $\mathcal{D}$ (called a decider)
takes as input an index $n$, questions $(x, y)$, and answers $(a, b)$, and returns a single-bit decision. For the sake
of this proof overview we assume that the sampling and decision procedures run in time polynomial in the
index $n$; we refer to the running time of these procedures as the complexity of the verifier. Given a normal
form verifier $\mathcal{V} = (\mathcal{S}, \mathcal{D})$ we associate to it an infinite family of nonlocal games $\{\mathcal{V}_n\}$ indexed by positive
integers in the natural way.

The main technical result of this paper is a gap-preserving compression transformation on normal form
verifiers. The following theorem presents an informal summary of the properties of this transformation.
Recall that for a game $\mathfrak{G}$ and probability $0 \le p \le 1$, $\mathcal{E}(\mathfrak{G}, p)$ denotes the minimum local dimension of an
entangled state shared by the players in order for them to succeed in $\mathfrak{G}$ with probability at least $p$.


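Theorem 2.1 (Gap-preserving compression of normal form verifiers, informal). There exists a
polynomial-time Turing machine Compress that, when given as input the description of a normal form verifier
$\mathcal{V} = (\mathcal{S}, \mathcal{D})$, outputs the description of another normal form verifier $\mathcal{V}' = (\mathcal{S}', \mathcal{D}')$ that satisfies the
following properties: for all $n \in \mathbb{N}$, letting $N = 2^n$,


1. (Completeness) If $\mathrm{val}^*(\mathcal{V}_N) = 1$ then $\mathrm{val}^*(\mathcal{V}'_n) = 1$.


2. (Soundness) If $\mathrm{val}^*(\mathcal{V}_N) \le \frac{1}{2}$ then $\mathrm{val}^*(\mathcal{V}'_n) \le \frac{1}{2}$.


3. (Entanglement lower bound) $\mathcal{E}(\mathcal{V}'_n, \frac{1}{2}) \ge \max\{\mathcal{E}(\mathcal{V}_N, \frac{1}{2}),\ 2^{2^{\Omega(n)}}\}$.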


The formal version of this theorem is stated in Section 12 as Theorem 12.1. The terminology compression is
motivated by the fact, implicit in the informal statement of the theorem, that the time complexity of the
verifier’s sampling and decision procedures in the game $\mathcal{V}'_n$, which is polynomial in $n$, is exponentially smaller
than the time complexity of the verifier in the game $\mathcal{V}_N$, which is polynomial in $N$ and thus exponential in
$n$.


Before giving an overview of the proof of Theorem 2.1 we sketch how the existence of a Turing machine
Compress with the properties stated in the theorem implies the inclusion $\mathrm{RE} \subseteq \mathrm{MIP}^*$. Recall that the
complexity class RE consists of all languages $L$ such that there is a Turing machine $M$ that accepts instances
$x$ in $L$, and does not accept instances $x$ that are not in $L$ (but is not required to terminate on such instances).
To show $\mathrm{RE} \subseteq \mathrm{MIP}^*$ we give an MIP* protocol for the Halting Problem, which is a complete problem for
RE. The Halting Problem is the language that consists of all Turing machine descriptions $M$ such that $M$
halts when run on an empty input tape. (For the purposes of this overview, we blur the distinction between a
Turing machine and its description as a string of bits.) We give a procedure that given a Turing machine $M$
as input returns the description of a normal form verifier $\mathcal{V}^M = (\mathcal{S}^M, \mathcal{D}^M)$ with the following properties.
First, if $M$ does eventually halt on an empty input tape, then it holds that for all $n \in \mathbb{N}$, $\mathrm{val}^*(\mathcal{V}^M_n) = 1$.
Second, if $M$ does not halt then for all $n \in \mathbb{N}$, $\mathrm{val}^*(\mathcal{V}^M_n) \le \frac{1}{2}$.

We describe the procedure that achieves this. Informally, the procedure returns the specification of a
verifier $\mathcal{V}^M = (\mathcal{S}^M, \mathcal{D}^M)$ such that $\mathcal{D}^M$ proceeds as follows: on input $(n, x, y, a, b)$ it first executes the
Turing machine $M$ for $n$ steps. If $M$ halts, then $\mathcal{D}^M$ accepts. Otherwise, $\mathcal{D}^M$ computes the description
of the compressed verifier $\mathcal{V}' = (\mathcal{S}', \mathcal{D}')$ that is the output of Compress on input $\mathcal{V}^M$, then executes the
decision procedure $\mathcal{D}'(n, x, y, a, b)$ and accepts if and only if $\mathcal{D}'$ accepts.^15

To show that this procedure achieves the claimed transformation, consider two cases. First, observe that
if $M$ eventually halts in some number of time steps $T$, then by definition $\mathrm{val}^*(\mathcal{V}^M_n) = 1$ for all $n \ge T$.
Using Theorem 2.1 along with an inductive argument it then follows that $\mathrm{val}^*(\mathcal{V}^M_n) = 1$ for all $n \ge 1$.
Second, if $M$ never halts, then observe that for any $n \ge 1$ Theorem 2.1 implies two separate lower bounds
on the amount of entanglement required to win the game $\mathcal{V}^M_n$ with probability at least $\frac{1}{2}$: the dimension
is (a) at least $2^{2^{\Omega(n)}}$, and (b) at least the dimension needed to win the game $\mathcal{V}^M_{2^n}$ with probability at least
$\frac{1}{2}$. Applying an inductive argument it follows that an infinite amount of entanglement is needed to win the
game $\mathcal{V}^M_n$ with any probability greater than $\frac{1}{2}$. Thus, a sequence of finite-dimensional strategies for $\mathcal{V}^M_n$
cannot lead to a limiting value larger than $\frac{1}{2}$, and $\mathrm{val}^*(\mathcal{V}^M_n) \le \frac{1}{2}$.
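
The inductive argument in the halting case amounts to following the index map $n \mapsto 2^n$ until it reaches the halting time. The sketch below is a toy model of that chain only, under the assumption that each application of Theorem 2.1 relates index $n$ to index $2^n$; it does not implement Compress or any game, and the function name is hypothetical.

```python
def compressions_until_halt(n, halting_time):
    """Count the index jumps n -> 2**n needed before n >= halting_time.

    Toy model of the completeness induction: once the index reaches the
    halting time T the decider accepts directly, and each earlier index n
    inherits value 1 from index 2**n via the completeness property.
    """
    steps = 0
    while n < halting_time:
        n = 2 ** n
        steps += 1
    return steps


# Starting from n = 1 the indices visited are 1, 2, 4, 16, 65536, ...
print(compressions_until_halt(1, 16))
```

The count is finite for every halting time, which is why value 1 propagates down to every index when $M$ halts. When $M$ never halts the same chain never terminates, and the entanglement lower bounds accumulated along it grow without bound.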

We continue with an overview of the ideas behind the proof of Theorem 2.1. 


Compression by introspection. To start, it is useful to review the protocol introduced in [NW19] to
show the inclusion $\mathrm{NEEXP} \subseteq \mathrm{MIP}^*$. Fix an NEEXP-complete language $L$. The MIP* protocol for NEXP
from [NV18b], when scaled up to decide languages from NEEXP, yields a family of nonlocal games $\{\mathfrak{G}_z\}$
that are indexed by instances $z \in \{0,1\}^*$. The family of games decides $L$ in the sense that for all $z$, the
game $\mathfrak{G}_z$ has entangled value 1 if $z \in L$, and has entangled value at most $\frac{1}{2}$ if $z \notin L$. Furthermore, if
$n = |z|$ is the length of $z$, the verifier of the game $\mathfrak{G}_z$ has complexity $\mathrm{poly}(N) = \exp(|z|)$ (recall that we
use this terminology to refer to an upper bound on the running time of the verifier’s sampling and decision
procedure). Thus, this family of games does not by itself yield an MIP* protocol for $L$. To overcome this the
main contribution in [NW19] is the design of an efficient compression procedure Compress that applies
specifically to the family of games $\{\mathfrak{G}_z\}$. When given as input the description of $\mathfrak{G}_z$, Compress returns
the description of a game $\mathfrak{G}'_z$ such that if $\mathrm{val}^*(\mathfrak{G}_z) = 1$, then $\mathrm{val}^*(\mathfrak{G}'_z) = 1$, and if $\mathrm{val}^*(\mathfrak{G}_z) \le \frac{1}{2}$, then
$\mathrm{val}^*(\mathfrak{G}'_z) \le \frac{1}{2}$. Furthermore, the complexity of the verifier for $\mathfrak{G}'_z$ is $\mathrm{poly}(n)$. Thus the family of games
$\{\mathfrak{G}'_z\}$ decides $L$ and this shows that $\mathrm{NEEXP} \subseteq \mathrm{MIP}^*$, which is the best lower bound known on MIP* prior
to our work.

Presented in this way, it is natural to suggest iterating the procedure Compress to achieve e.g. the
inclusion $\mathrm{NEEEXP} \subseteq \mathrm{MIP}^*$. To explain the difficulty in doing so, we give a little more detail on the
compression procedure. It consists of two main steps: starting from $\mathfrak{G}_z$, perform (1) question reduction, and
(2) answer reduction. The goal of (1) is to reduce the length of the questions generated by the verifier in
$\mathfrak{G}_z$ from $\mathrm{poly}(N)$ to $\mathrm{poly}(n)$. The goal of (2) is to achieve the same with respect to the length of answers
expected from the players. Furthermore, the complexity of the verifier of the resulting game $\mathfrak{G}'_z$ should be
reduced from $\mathrm{poly}(N)$ to $\mathrm{poly}(n)$.

^15 The fact that the decider $\mathcal{D}^M$ can invoke the Compress procedure on itself follows from a well-known result in
computability theory known as Kleene’s recursion theorem (also called Rogers’s fixed point theorem) [Kle54, Rog87].

Part (1) is achieved through a technique referred to as “introspection” where, rather than sampling
questions $(x, y)$ of length $\mathrm{poly}(N)$ as in the game $\mathfrak{G}_z$, the verifier instead executes a carefully crafted
nonlocal game with the players that (a) requires questions of length $\mathrm{poly}(n)$, (b) checks that the players share
$\mathrm{poly}(N)$ EPR pairs, and (c) checks that the players measure the EPR pairs in such a way as to sample for
themselves a question pair $(x, y)$ such that one player gets $x$ and the other player gets $y$. In other words, the
players are essentially forced to introspectively ask themselves the questions of $\mathfrak{G}_z$.

After question reduction, the players still respond with $\mathrm{poly}(N)$-length answers, which the verifier
has to check satisfy the decision predicate of the original game $\mathfrak{G}_z$. The goal of Part (2) is to enable
the decision procedure to implement the verification procedure while not requiring the entire full-length
answers from the players. In the answer reduction scheme of [NW19] this is achieved by having the verifier
run a probabilistically checkable proof (PCP) with the players so that they succinctly prove that first, they
have introspected questions $(x, y)$ from the correct distribution, and second, that they are able to generate
$\mathrm{poly}(N)$-length answers $(a, b)$ that would satisfy the decision predicate of the original game $\mathfrak{G}_z$ when
executed on $(x, y)$ and $(a, b)$. Since the questions and answers in the PCP are of length $\mathrm{poly}(n)$, this
achieves the desired answer length reduction.

Iterating this scheme presents a number of immediate difficulties that have to do with the fact that the
sampling and decision procedures of the verifier in $\mathfrak{G}'_z$ do not have such a nice form as those in $\mathfrak{G}_z$. First
of all, the compression procedure of [NW19] can only “introspect” a specific question distribution of a
nonlocal game from [NV18b]; we call this distribution a “low-degree test distribution”.^16 However, the
resulting question distribution of the question-reduced verifier, which is used to check the introspection,
has a much more complex structure. A similar issue arises with the modifications required to perform
answer reduction. In the PCP employed to achieve this the question distribution appears to be much more
complex than the low-degree test distribution (this is in large part due to the need for a specially tailored PCP
procedure that encodes separately different chunks of the witness, corresponding to answers from different
players). As a result it is entirely unclear at first whether the question distribution used by the verifier in $\mathfrak{G}'_z$
can be “introspected” for a second time.

To overcome these difficulties we identify a natural class of question distributions, called conditionally
linear distributions, that generalize the low-degree test distribution. We show that conditionally linear
distributions can be “introspected” using conditionally linear distributions only, enabling recursive introspection.
(In particular, they are a rich enough class to capture the types of question distributions produced by the
compression scheme of [NW19].) We define normal form verifiers by restricting their sampling procedure to
generate conditionally linear question distributions, and this allows us to obtain the compression procedure
on normal form verifiers described in Theorem 2.1.

Conceptually, the identification of a natural class of distributions that is “closed under introspection” is 
a key step that enables the introspection technique to be applied recursively. (As we will see later, other 


¹⁶ The nonlocal game of [NV18b] is part of a more general family of games called “low-degree tests”, which have been studied
extensively for their applications in complexity theory [BFL91], property testing [RS96], and PCPs [AS98]. Generally, the question
distribution of a low-degree test is a random question pair $(x,y)$ where $y$ is a randomly chosen affine subspace in $\mathbb{F}^m$ and $x$ is a
uniformly random point on $y$, where $\mathbb{F}$ is a finite field and $m > 2$ is an integer. The specific low-degree test game in [NV18b]
(which is based on the low-degree test of [RS97]) uses two-dimensional subspaces; thus the question distribution of [RS97, NV18b]
is referred to as the “plane-point distribution”.




closure properties of conditionally linear distributions, such as taking direct products, play an important role 
as well.) Since conditionally linear distributions are central to our construction we describe them next. 


Conditionally linear distributions. Fix a vector space $V$ that is identified with $\mathbb{F}^m$, for a finite field $\mathbb{F}$ and
integer $m$. Informally (see Definition 4.1 for a precise definition), a function $L$ on $V$ is conditionally linear
(CL for short) if it can be evaluated on an input $z \in V$ by a procedure of the following form: (i) read a substring $z^{(1)}$
of $z$; (ii) evaluate a linear function $L_1$ on $z^{(1)}$; (iii) repeat steps (i) and (ii) with the remaining coordinates
$z \setminus z^{(1)}$, such that the next steps are allowed to depend in an arbitrary way on $L_1(z^{(1)})$ but not directly on
$z^{(1)}$ itself. What distinguishes a function of this form from an arbitrary function is that we restrict the
number of iterations of (i)-(ii) to a constant number (at most 9, in our case). (One may also think of
CL functions as “adaptively linear” functions, where the number of “levels” of adaptivity is the number of
iterations of (i)-(ii).) A distribution over pairs $(x,y) \in V \times V$ is called conditionally linear if it is the
image under a pair of conditionally linear functions $L^A, L^B : V \to V$ of the uniform distribution on $V$, i.e.
$(x,y) \sim (L^A(z), L^B(z))$ for uniformly random $z \in V$.
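
As an informal illustration of steps (i)-(iii), the following Python sketch evaluates a two-level conditionally linear function over a small prime field. It is not taken from the paper: the names `cl_eval`, `FIRST`, `SECOND`, and `f` are hypothetical, the field size and the example maps are arbitrary choices, and the precise notion is the one in Definition 4.1.

```python
P = 2  # field size; arithmetic is mod P (P is assumed prime)

def apply_linear(matrix, vec):
    """Apply a linear map, given as a list of rows, to a vector over F_P."""
    return tuple(sum(a * b for a, b in zip(row, vec)) % P for row in matrix)

def cl_eval(z, split, first_map, second_maps):
    """Evaluate a two-level conditionally linear function on the tuple z."""
    z1, z2 = z[:split], z[split:]            # (i) read the substring z1
    y1 = apply_linear(first_map, z1)         # (ii) apply the linear map L1
    # (iii) the map used on the remaining coordinates may depend on L1(z1),
    # but it is looked up from y1 only, never from z1 itself.
    y2 = apply_linear(second_maps[y1], z2)
    return y1 + y2

# Hypothetical example on F_2^3: the first level reads one coordinate.
FIRST = [[1]]
SECOND = {(0,): [[1, 0], [0, 1]],   # identity on the last two coordinates
          (1,): [[1, 1], [0, 0]]}   # a different linear map when L1(z1) = 1

def f(z):
    return cl_eval(z, 1, FIRST, SECOND)
```

Each branch of `f` is linear, yet `f` as a whole is not: `f((1, 0, 1))` differs from the coordinate-wise sum of `f((1, 0, 0))` and `f((0, 0, 1))`, which is the sense in which such functions are only "adaptively" linear.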

An important class of CL distributions are low-degree test distributions, which are distributions over
question pairs $(x,y)$ where $y$ is a randomly chosen affine subspace of $\mathbb{F}^m$ and $x$ is a uniformly random
point on $y$. We explain this for the case where the random subspace $y$ is one-dimensional (i.e. a line). Let
$V = V_X \oplus V_V$ where $V_X = V_V = \mathbb{F}^m$. Let $L^A$ be the projection onto $V_X$ (i.e. it maps $(x,v) \mapsto (x,0)$
where $x \in V_X$ and $v \in V_V$). Define $L^B : V \to V$ as the map $(x,v) \mapsto (L^{\mathrm{LN}}_v(x), v)$ where $L^{\mathrm{LN}}_v : V_X \to V_X$
is a linear map that, for every $v \in V_V$, projects onto a complementary subspace to the one-dimensional
subspace of $V_X$ spanned by $v$ (one can think of this as an “orthogonal subspace” to the span of $\{v\}$). $L^B$ is
conditionally linear because it can be seen as first reading the substring $v \in V_V$ (which can be interpreted as
specifying the direction of a line), and then applying a linear map $L^{\mathrm{LN}}_v$ to $x \in V_X$ (which can be interpreted
as specifying a canonical point on the line $\ell = \{x + tv : t \in \mathbb{F}\}$). It is not hard to see (and shown formally
in Section 7.1.1) that the distribution of $(L^A(z), L^B(z))$ for $z$ uniform in $V$ is identical (up to relabeling) to
the low-degree test distribution $(x, \ell)$ where $\ell$ is a uniformly random affine line in $\mathbb{F}^m$, and $x$ is a uniformly
random point on $\ell$.
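
The line-point construction can be checked by brute force on a tiny instance. The sketch below is only an illustration under stated assumptions: the field $\mathbb{F}_3$ and dimension 2 are arbitrary, the function names are hypothetical, and the projection is one concrete choice (it zeroes out the coordinate at the first nonzero entry of the direction); the paper's formal treatment is in Section 7.1.1. The "point" player's output is represented by $x$ alone rather than $(x, 0)$, and a line is labeled by a (canonical point, direction) pair, so the same geometric line appears under several labels.

```python
import itertools
from collections import defaultdict

P, M = 3, 2  # small prime field F_3 and dimension m = 2 (illustrative values)

def line_projection(x, v):
    """Project x onto span{e_i : i != pivot}, parallel to span{v}."""
    pivot = next((i for i, c in enumerate(v) if c != 0), None)
    if pivot is None:            # v = 0: degenerate case, nothing to project out
        return x
    t = x[pivot] * pow(v[pivot], -1, P) % P   # multiple of v to subtract
    return tuple((xi - t * vi) % P for xi, vi in zip(x, v))

def L_A(x, v):
    return x                                  # the "point" player only sees x

def L_B(x, v):
    return (line_projection(x, v), v)         # canonical point plus direction

# Group the points x by the line label that the "line" player receives.
points_on = defaultdict(list)
for z in itertools.product(range(P), repeat=2 * M):
    x, v = z[:M], z[M:]
    points_on[L_B(x, v)].append(L_A(x, v))
```

For every label with nonzero direction, the collected points are exactly the `P` points of the corresponding line, each appearing once. Conditioned on the line label, the point is therefore uniform on that line.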

Our main result about CL distributions, presented in Section 8, is that any CL distribution $\mu$, associated
with a pair of CL functions $(L^A, L^B)$ over a linear space $V = \mathbb{F}^m$, can be “introspected” using a CL
distribution that is “exponentially smaller” than the initial distribution. Slightly more formally, to any CL
distribution $\mu$ we associate a two-player game $\mathfrak{G}_\mu$ (called the “introspection game”) in which questions
from the verifier are sampled from a CL distribution $\mu'$ over $\mathbb{F}^{m'}$ for some $m' = \mathrm{polylog}(m)$ and such
that in any successful strategy for the game $\mathfrak{G}_\mu$, when the players are queried on a special question labeled
INTRO, they must respond with a pair $(x,y)$ that is approximately distributed according to $\mu$. (The game
allows us to do more: it allows us to conclude how the players obtained $(x, y)$, namely by measuring shared EPR
pairs in a specific basis, and this will be important when using the game as part of a larger protocol that
involves other checks.) Crucially for us, the distribution $\mu'$ only depends on a size parameter associated with
$(L^A, L^B)$ (essentially, the integer $m$ together with the number of “levels” of adaptivity of $L^A$ and $L^B$), but
not on any other structural property of $(L^A, L^B)$. Only the decision predicate for the introspection game $\mathfrak{G}_\mu$
depends on the entire description of $(L^A, L^B)$.

We say a few words about the design of $\mu'$ and the associated introspection game, which borrow heavily
from [NW19]. Building on the “quantum low-degree test” introduced in [NV18a] it is already known how a
verifier can force a pair of players to measure $m$ EPR pairs in either the computational or Hadamard basis and
report the (necessarily identical) outcome $z$ obtained, all the while using questions of length polylogarithmic
in $m$ only. The added difficulty is to ensure that a player obtains, and returns, precisely the information about




$z$ that is contained in $L^A(z)$ (resp. $L^B(z)$), and not more. A simple example is the line-point distribution
described earlier: there, to ensure that e.g. the “point” player only obtains the first component $x$
of $(x,v) \in V_X \oplus V_V$, the verifier demands that the “point” player measures their qubits associated with
the space $V_V$ in the Hadamard, instead of computational, basis; due to the uncertainty principle this has the
effect of “erasing” the outcome in the computational basis. The case of the “line” player is a little more
complex: the goal is to ensure that, conditioned on the specification of the line $\ell$ received by the “line”
player, the point $x$ received by the “point” player is uniformly random within $\ell$. This was shown to be
possible in [NW19].

We can now describe how samplers of normal form verifiers are defined: these are Turing machines
$\mathcal{S}$ that specify an infinite family of CL distributions $\{\mu_n\}$ by computing, for each index $n$, the CL functions
$(L^{A,n}, L^{B,n})$ associated with $\mu_n$. (See Definition 4.14 for a formal definition of samplers.) Thus, the
question distributions of a normal form verifier $\mathcal{V} = (\mathcal{S}, \mathcal{D})$ are the CL distributions corresponding to $\mathcal{S}$.


Question reduction. Just like the compression procedure of [NW19], the compression procedure Compress
of Theorem 2.1 begins with performing question reduction on the input game. Given a normal form verifier
$\mathcal{V} = (\mathcal{S}, \mathcal{D})$, the procedure Compress first computes a normal form verifier $\mathcal{V}^{\mathrm{INTRO}} = (\mathcal{S}^{\mathrm{INTRO}}, \mathcal{D}^{\mathrm{INTRO}})$
where for all $n \in \mathbb{N}$, the game $\mathcal{V}^{\mathrm{INTRO}}_n$ consists of playing the original game $\mathcal{V}_N$ where $N = 2^n$, except
that instead of sampling the questions according to the CL distribution $\mu_N$ specified by the sampler $\mathcal{S}$,
the verifier executes the introspection game $\mathfrak{G}_{\mu_N}$ described in the previous subsection. Thus, in the game
$\mathcal{V}^{\mathrm{INTRO}}_n$, when both players receive the question labeled INTRO they are expected to sample $(x,y)$ respectively
according to $\mu_N$, and respond with the sampled question together with answers $a, b$ respectively. The
decider $\mathcal{D}^{\mathrm{INTRO}}$ on index $n$ evaluates $\mathcal{D}(N,x,y,a,b)$ and accepts if and only if $\mathcal{D}$ accepts. As a result the
time complexity of decider $\mathcal{D}^{\mathrm{INTRO}}$ on index $n$ remains that of $\mathcal{D}$, i.e. poly(N). However, the length of
questions asked in $\mathcal{V}^{\mathrm{INTRO}}_n$ and the complexity of the sampler $\mathcal{S}^{\mathrm{INTRO}}$ are exponentially reduced, to poly(n).

For convenience we refer to the questions asked by the verifier in the “question-reduced” game $\mathcal{V}^{\mathrm{INTRO}}_n$
as “small questions,” and the questions that are introspected by the players in $\mathcal{V}^{\mathrm{INTRO}}_n$ (equivalently, the
questions asked in the original game $\mathcal{V}_N$) as “big questions.”


Answer reduction. Having reduced the complexity of the question sampling, the next step in the compression
procedure Compress is to reduce the complexity of decider $\mathcal{D}^{\mathrm{INTRO}}$ from poly(N) to poly(n) (which
necessarily implies reducing the answer length to poly(n)). To achieve this the compression procedure
computes a normal form verifier $\mathcal{V}^{\mathrm{AR}} = (\mathcal{S}^{\mathrm{AR}}, \mathcal{D}^{\mathrm{AR}})$ from $\mathcal{V}^{\mathrm{INTRO}}$ such that both the sampler and decider
complexity in $\mathcal{V}^{\mathrm{AR}}$ are poly(n) (here, AR stands for “answer reduction”).

Similarly to the answer reduction performed in [NW19], at a high level this is achieved by composing
the game $\mathcal{V}^{\mathrm{INTRO}}_n$ with a probabilistically checkable proof (PCP). In our context a PCP is a proof encoding
that allows a verifier to check whether, given Turing machine A and time bound T provided as input, there 
exists some input x that A accepts in time T. The PCP proof can be computed from A, T, and the accepting 
input (if it exists) and has length polynomial in T and the description length |A| of A. Crucially, the 
verifier can check a purported proof while only reading a constant number of symbols of it, each of length 
polylog(T, |A|), and executing a verification procedure that runs in time polylog(T, |A|). 

We use PCPs for answer reduction as follows. The verifier in the game $\mathcal{V}^{\mathrm{AR}}_n$ samples questions as
$\mathcal{V}^{\mathrm{INTRO}}_n$ would and sends them to the players. Instead of receiving the introspected questions and answers
$(x, y, a, b)$ for the original game $\mathcal{V}_N$ and running the decision procedure $\mathcal{D}(N, x, y, a, b)$, the verifier asks the
players to compute a PCP $\Pi$ for the statement that the original decider $\mathcal{D}$ accepts the input $(N, x, y, a, b)$ in
time $T = \mathrm{poly}(N)$. The verifier then samples additional questions for the players that ask them to return
specific entries of the proof $\Pi$. Finally, upon receipt of the players’ answers, the verifier executes the PCP
verification procedure. Because of the efficiency of the PCP, both the sampling of the additional questions
and the decision procedure can be executed in time poly(n).¹⁷

This very rough sketch presents some immediate difficulties. A first difficulty is that in general no player
by themselves has access to the entire input $(N, x, y, a, b)$ to $\mathcal{D}$, so no player can compute the entire proof $\Pi$.
We discuss this issue in the next paragraph. A second difficulty is that a black-box application of an existing
PCP, as done in [NW19], results in a question distribution for $\mathcal{V}^{\mathrm{AR}}$ (i.e. the sampling of the proof locations
to be queried) that is rather complex, and in particular it may no longer fall within the framework of
CL distributions for which we can do introspection. To avoid this, we design a bespoke PCP based on the
classical MIP for NEXP (in particular, we borrow and adapt techniques from [BSS05, BSGH+06]). Two
essential properties for us are that (i) the PCP proof is a collection of several low-degree polynomials, two of
which are low-degree encodings of each player’s big answer in the game $\mathcal{V}^{\mathrm{INTRO}}_n$, and (ii) verifying the proof
only requires (a) running low-degree tests, (b) querying all polynomials at a uniformly random point, and
(c) performing simple consistency checks. Property (i) allows us to eliminate the extra layer of encoding
in [NW19], who had to consider a PCP of proximity for a circuit applied to the low-degree encodings of
the players’ big answers. Property (ii) allows us to ensure that the question distribution employed by $\mathcal{V}^{\mathrm{AR}}$
remains conditionally linear.


Oracularization. The preceding paragraph raises a non-trivial difficulty. In order for the players to com- 
pute a proof for the claim that D(N, x, y,a,b) = 1 they need to know the entire input (x, y, a,b). However, 
in general a player only has access to their own question and answer: one player only knows (x,a) and 
the other player knows (y,b). The standard way of circumventing this difficulty is to consider an “orac- 
ularized” version of the game, where one player gets both questions (x,y) and is able to determine both 
answers (a,b), while the other player only gets one of the questions at random, and is only asked for one of 
the answers, that is then checked for consistency with the first player’s answer. 
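
For classical, deterministic players the oracularization idea can be sketched in a few lines of Python. This is a simplified illustration and not the paper's construction: the function names, the toy XOR game, and the choice to cross-check one uniformly chosen answer are all assumptions made for the example.

```python
import random

def oracularized_round(sample_questions, decide, oracle_player, plain_player, rng=random):
    """Play one round of a simplified oracularized game with deterministic players.

    sample_questions() -> (x, y); decide(x, y, a, b) -> bool.
    oracle_player((x, y)) -> (a, b); plain_player(role, q) -> answer.
    """
    x, y = sample_questions()
    a, b = oracle_player((x, y))          # sees both questions, answers for both
    role = rng.choice(["A", "B"])         # which single question to cross-check
    single = plain_player(role, x if role == "A" else y)
    consistent = single == (a if role == "A" else b)
    return decide(x, y, a, b) and consistent

# Toy game (hypothetical): the players win iff a XOR b == x XOR y.
def sample_questions():
    return random.randint(0, 1), random.randint(0, 1)

def decide(x, y, a, b):
    return (a ^ b) == (x ^ y)

def honest_oracle(q):
    return q                              # answers a = x and b = y

def flipped_oracle(q):
    x, y = q
    return 1 - x, 1 - y                   # still satisfies the original predicate

def honest_plain(role, q):
    return q
```

The honest pair always passes. The flipped oracle also satisfies the original predicate but is caught by the consistency check, which is what forces the player holding both questions to answer as the two original players would.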

While this technique works well for games with classical players, when the players are allowed to use
quantum strategies using entanglement, oracularization does not, in general, preserve the completeness property
of the game. To ensure that completeness is preserved we need an additional property of a completeness-achieving
strategy for the original game: that there exists a commuting and consistent strategy on all pairs of
questions $(x,y)$ that are asked in the game with positive probability. Here commuting means that the measurement
$\{A^x_a\}_a$ performed by the player receiving $x$ commutes with the measurement $\{B^y_b\}_b$ performed by
the player receiving $y$.¹⁸ Consistent means that if both players perform measurements associated with the
same question they obtain the same answer. If both properties hold then in the oracularized game when one
player receives a pair $(x, y)$ and the other player receives the question $x$ (say), the first player can simultaneously
measure both $\{A^x_a\}_a$ and $\{B^y_b\}_b$ on their own space to obtain a pair of answers $(a,b)$, and the second
player can measure $\{A^x_a\}_a$ to obtain a consistent answer $a'$.

For answer reduction to be possible it is thus applied to the oracularized version of the introspection
game $\mathcal{V}^{\mathrm{INTRO}}_n$. This in turn requires us to ensure that the introspection game $\mathcal{V}^{\mathrm{INTRO}}_n$ has a commuting and
consistent strategy achieving value 1 whenever it is the case that $\mathrm{val}^*(\mathcal{V}^{\mathrm{INTRO}}_n) = 1$. For this property to
hold we verify that it holds for the initial game that is used to seed the compression procedure (this is true
because we can start with an MIP* protocol for NEXP for which there exists a perfect classical strategy)


¹⁷ This idea is inspired by the technique of composition in the PCP literature, in which the complexity of a verification procedure
can be reduced by composing a proof system (often a PCP itself) with another PCP.

¹⁸ We stress that the commuting property only applies to question pairs that occur with positive probability, and does not mean
that all pairs of measurement operators are required to commute; indeed this would imply that the strategy is effectively classical.




and we also ensure that each of the transformations of the compression protocol (question reduction, answer 
reduction, and parallel repetition described next) maintains it. 


Parallel repetition. The combined steps of question reduction (via introspection) and answer reduction
(via PCP composition) result in a game $\mathcal{V}^{\mathrm{AR}}_n$ such that the complexity of the verifier is poly(n). Furthermore,
if the original game $\mathcal{V}_N$ has value 1, then $\mathcal{V}^{\mathrm{AR}}_n$ also has value 1. Unfortunately the sequence of
transformations incurs a loss in the soundness parameters: if $\mathrm{val}^*(\mathcal{V}_N) \le \frac{1}{2}$, then we can only establish that
$\mathrm{val}^*(\mathcal{V}^{\mathrm{AR}}_n) \le 1 - C$ for some positive constant $C < \frac{1}{2}$ (we call $C$ the soundness gap). Such a loss would
prevent us from recursively applying the compression procedure Compress an arbitrary number of times,
which is needed to obtain the desired complexity results for MIP*.

To overcome this we need a final transformation to restore the soundness gap of the games after answer
reduction to a constant larger than $\frac{1}{2}$. To achieve this we use the technique of parallel repetition. The parallel
repetition of a game $\mathfrak{G}$ is another nonlocal game $\mathfrak{G}^k$, for some number of repetitions $k$, which consists of
playing $k$ independent and simultaneous instances of $\mathfrak{G}$ and accepting if and only if all $k$ instances accept.
Intuitively, parallel repetition is meant to decrease the value of a game $\mathfrak{G}$ exponentially fast in $k$, provided
$\mathrm{val}^*(\mathfrak{G}) < 1$ to begin with. However, it is an open question whether this is generally true for the
entangled value $\mathrm{val}^*$.
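
The definition of parallel repetition can be made concrete with a brute-force computation. The sketch below is an illustration only: it computes the classical value rather than the entangled value $\mathrm{val}^*$ discussed in the text, it uses the CHSH game as an assumed example with uniform independent questions, and all function names are hypothetical.

```python
import itertools

def chsh(x, y, a, b):
    """CHSH predicate: win iff a XOR b == x AND y."""
    return (a ^ b) == (x & y)

def repeated(pred, k):
    """Predicate of the k-fold parallel repetition: all k instances must accept."""
    def rep(xs, ys, as_, bs):
        return all(pred(x, y, a, b) for x, y, a, b in zip(xs, ys, as_, bs))
    return rep

def classical_value(pred, questions, answers):
    """Brute-force the classical value under uniform, independent questions."""
    best = 0
    for alice in itertools.product(answers, repeat=len(questions)):
        a_of = dict(zip(questions, alice))
        # For a fixed Alice, Bob's best answer can be chosen separately per question y.
        wins = sum(
            max(sum(pred(x, y, a_of[x], b) for x in questions) for b in answers)
            for y in questions
        )
        best = max(best, wins)
    return best / len(questions) ** 2

bits = [0, 1]
pairs = list(itertools.product(bits, repeat=2))
v1 = classical_value(chsh, bits, bits)                  # single CHSH game
v2 = classical_value(repeated(chsh, 2), pairs, pairs)   # two parallel copies
```

In the repeated game each player sees both of their questions before answering either, so strategies need not be products of single-game strategies. This is why the value of the repetition is only guaranteed to lie between `v1 ** 2` and `v1`, and why exponential decay is not automatic.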

Nevertheless, some variants of parallel repetition are known to achieve exponential amplification. We
use a variant called “anchored parallel repetition”, introduced in [BVY17]. This allows us to devise
a transformation that efficiently amplifies the soundness gap to a constant. The resulting game $\mathcal{V}^{\mathrm{REP}}_n$ has
the property that if $\mathrm{val}^*(\mathcal{V}^{\mathrm{AR}}_n) = 1$, then $\mathrm{val}^*(\mathcal{V}^{\mathrm{REP}}_n) = 1$ (and moreover this is achieved using a commuting
and consistent strategy), whereas if $\mathrm{val}^*(\mathcal{V}^{\mathrm{AR}}_n) \le 1 - C$ for some universal constant $C > 0$ then
$\mathrm{val}^*(\mathcal{V}^{\mathrm{REP}}_n) \le \frac{1}{2}$. Furthermore, we have the additional property, essential for us, that good strategies in
the game $\mathcal{V}^{\mathrm{REP}}_n$ require as much entanglement as good strategies in the game $\mathcal{V}^{\mathrm{AR}}_n$ (which in turn require
as much entanglement as good strategies in the game $\mathcal{V}_N$). The complexity of the verifier in $\mathcal{V}^{\mathrm{REP}}$ remains
poly(n).

The anchored parallel repetition procedure, when applied to a normal form verifier, also yields a normal 
form verifier: this is because the direct product of CL distributions is still conditionally linear. 


Putting it all together. This completes the overview of the transformations performed by the compression
procedure Compress of Theorem 2.1. To summarize, given an input normal form verifier $\mathcal{V}$, question
reduction is applied to obtain $\mathcal{V}^{\mathrm{INTRO}}$, answer reduction is applied to the oracularized version of $\mathcal{V}^{\mathrm{INTRO}}$ to
obtain $\mathcal{V}^{\mathrm{AR}}$, and anchored parallel repetition is applied to obtain $\mathcal{V}^{\mathrm{REP}}$, which is returned by the compression
procedure. Each of these transformations preserves completeness (including the commuting and consistent
properties of a value-1 strategy) as well as the entanglement requirements of each game; moreover, the
overall transformation preserves soundness.




3 Preliminaries 


Notation. We use $\Sigma$ to denote a finite alphabet. $\mathbb{N}$ is the set of positive integers. For $w \in \{0,1\}$, $\overline{w}$
denotes $1 - w$. For $w \in \{A,B\}$, $\overline{w} = B$ if $w = A$ and $\overline{w} = A$ otherwise. (For notational convenience we
often implicitly make the identifications $1 \leftrightarrow A$ and $2 \leftrightarrow B$.) We use $\mathbb{F}$ to denote a finite field. We write
$\mathrm{M}_n(\mathbb{F})$ to denote the set of $n \times n$ matrices over $\mathbb{F}$. We write $I$ to denote the identity operator on a vector
space. We write $\mathrm{Tr}(\cdot)$ for the matrix trace. We write $\mathcal{H}$ to denote a separable Hilbert space. For a linear
operator $T$, $\|T\|$ denotes the operator norm.


Asymptotics. All logarithms are base 2. We use the notation $O(\cdot)$, $\mathrm{poly}(\cdot)$, and $\mathrm{polylog}(\cdot)$ in the following
way. For $f, g : \mathbb{N} \to \mathbb{R}_+$ we write $f(n) = O(g(n))$ (omitting the integer $n$ when it is clear from
context) to mean that there exists a constant $C > 0$ such that for all $n \in \mathbb{N}$, $f(n) \le C g(n)$. When we
write $f(a_1, \ldots, a_k) = \mathrm{poly}(a_1, \ldots, a_k)$, this indicates that there exists a universal constant $C > 0$ (which
may vary each time the notation is used in the paper) such that $f(a_1, \ldots, a_k) \le C (a_1 \cdots a_k)^C$ for all positive
$a_1, \ldots, a_k$. Similarly, when we write $f(a_1, \ldots, a_k) = \mathrm{polylog}(a_1, \ldots, a_k)$, there exists a universal
constant $C$ such that $f(a_1, \ldots, a_k) \le C \prod_{i=1}^{k} \log^C(1 + a_i)$ for all positive $a_1, \ldots, a_k$. Finally, we write
$\log(a_1, \ldots, a_k)$ as shorthand for $\prod_{i=1}^{k} \log(1 + a_i)$.¹⁹


3.1 Turing machines 


Turing machines are a model of computation introduced in [Tur37], and play a central role in our modeling 
of verifiers for nonlocal games. Here we give an overview of certain aspects of Turing machines that are 
relevant for this paper. For an in-depth treatment of Turing machines, we refer the reader to the textbooks 
of Papadimitriou [Pap94] or Sipser [Sip12]. 

The tapes of a Turing machine are infinite one-dimensional arrays of cells that are indexed by natural 
numbers. A $k$-input Turing machine $M$ has $k$ input tapes, one work tape, and one output tape. Each cell of
a tape contains a symbol taken either from the set $\{0,1\}$ or the blank symbol $\sqcup$. At the start of the execution of a
Turing machine, the work and output tapes are initialized to have all blank symbols. A Turing machine halts 
when it enters a designated halt state. The output of a Turing machine, when it halts, is the binary string that 
occupies the longest initial stretch of the output tape that does not have a blank symbol. If there are only 
blank symbols on the output tape, then by convention we say that the Turing machine’s output is 0. 
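
The output convention can be stated as a short function. This is a minimal sketch under assumptions: the infinite tape is modeled as a finite Python list, the blank symbol is represented by the character `"_"`, and the name `read_output` is hypothetical.

```python
BLANK = "_"  # stand-in for the blank symbol

def read_output(tape):
    """Return the longest blank-free initial stretch of the output tape."""
    out = []
    for cell in tape:
        if cell == BLANK:
            break
        out.append(cell)
    # By convention an all-blank tape yields the output 0. This sketch also
    # returns "0" whenever the first cell is blank (an assumption).
    return "".join(out) if out else "0"
```

For instance, a tape holding `1 0 1` followed by blanks yields `"101"`, and symbols written after the first blank are ignored.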

Every $k$-input Turing machine $M$ computes a (partial) function $f : (\{0,1\}^*)^k \to \{0,1\}^*$ where the
function is only defined on the subset $S \subseteq (\{0,1\}^*)^k$ of inputs $x$ on which $M$ halts. We use $M(x_1, x_2, \ldots, x_k)$
to denote the output of a $k$-input Turing machine $M$ when $x_i \in \{0,1\}^*$ is written on the $i$-th input tape for
$i \in \{1,2,\ldots,k\}$. If $M$ does not halt on an input $x$, then we define $M(x)$ to be $\bot$.

The time complexity of a Turing machine $M$ on input $x = (x_1, x_2, \ldots, x_k)$, denoted by $\mathsf{TIME}_{M,x}$, is
the number of time steps that $M$ takes on input $x$ before it enters its designated halt state; if $M$ never halts
on input $x$, then we define $\mathsf{TIME}_{M,x} = \infty$.

Every Turing machine $M$ has a canonical encoding as a bit string $\overline{M} \in \{0,1\}^*$, called the canonical
description of $M$. The canonical encoding describes the finite number of states and transition rules of $M$.
The details of this encoding are not important for this paper; we assume that some “reasonable” encoding
scheme is used where the bit length of the encoding is at most some fixed polynomial in the number of
states. The length of the description $\overline{M}$ is denoted by $|M|$. For every integer $k \in \mathbb{N}$ and every string
$\alpha \in \{0,1\}^*$, the $k$-input Turing machine described by $\alpha$ is denoted $[\alpha]_k$.


¹⁹ The additional 1 in the argument of the $\log(\cdot)$ is to ensure that this quantity is strictly positive.




Throughout this paper we frequently give high-level descriptions of Turing machines in English lan- 
guage, rather than explicitly describing them in terms of states and transition functions. In doing so we 
implicitly assume that all such descriptions can be formally converted into a description in terms of states 
and transition functions such that there is no hidden blow-up in the complexity of the representation. 

We elaborate on several properties of Turing machines that are implicitly used throughout the paper. 


Turing machine simulation. We frequently describe Turing machines as running or simulating other
Turing machines as “subroutines”. We assume that such simulations can be performed with a polynomial
overhead in terms of the time complexity. For example, suppose $A$ is a 1-input Turing machine that has the
following high-level description: on input $x$ (interpreted as an integer) it runs a 2-input Turing machine $B$ on
the input $(x, x^2)$ and obtains an output $y$ (if $B$ halts), and then finally $A$ returns the output $y$. We assume that
$A$ can efficiently simulate the Turing machine $B$, even though $B$ has a different number of input tapes. This
is because every $k$-input Turing machine $M$ can be efficiently simulated by a single tape Turing machine, as
given by the following result [HS65]. We assume that there is an unambiguous, binary encoding $\mathrm{enc}_k(x)$ of
$k$-tuples $x \in (\{0,1\}^*)^k$ that is computable in time $O(k + |x_1| + \cdots + |x_k|)$ where $x_i$ is the $i$-th component
of the tuple $x$.²⁰
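
The accompanying footnote names a "dual rail" scheme as a canonical choice of encoding: every bit is expanded via 0 to 01 and 1 to 10, and each string is terminated by 00. A minimal Python sketch of that scheme follows; the function names `enc` and `dec` are chosen here for illustration.

```python
def enc(*strings):
    """Dual-rail encode a tuple of bit strings: 0 -> 01, 1 -> 10, end marker 00."""
    table = {"0": "01", "1": "10"}
    return "".join("".join(table[b] for b in s) + "00" for s in strings)

def dec(code):
    """Invert enc by reading the code two symbols at a time."""
    out, cur = [], []
    for i in range(0, len(code), 2):
        pair = code[i:i + 2]
        if pair == "00":                  # end of the current component
            out.append("".join(cur))
            cur = []
        else:
            cur.append({"01": "0", "10": "1"}[pair])
    return tuple(out)
```

The encoding of a $k$-tuple has length $2(k + |x_1| + \cdots + |x_k|)$ and is produced in a single pass, which matches the linear-time bound assumed in the text. Because the pair 00 never encodes a bit, empty components are represented unambiguously.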


Theorem 3.1 (Efficient universal Turing machine [HS65]). For all $k \in \mathbb{N}$ there exists a single tape Turing
machine $\mathcal{U}_k$ (called a universal Turing machine) such that for every $x \in (\{0,1\}^*)^k$ and $\alpha \in \{0,1\}^*$,
if the tape of the Turing machine $\mathcal{U}_k$ is initialized with $\mathrm{enc}_{k+1}(\alpha, x_1, \ldots, x_k)$ and $[\alpha]_k$ halts within $T$ steps
on input $x$ then $\mathcal{U}_k$ halts in at most $C (k \cdot |\alpha| \cdot |x| \cdot T)^c$ steps and the contents of its tape is $[\alpha]_k(x)$. Here,
$C, c > 1$ are universal constants, and $|x|$ denotes the sum of lengths $|x_1| + \cdots + |x_k|$.


Thus, the Turing machine $A$ can simulate $B$ on input $(x, x^2)$ as follows: it first computes the encoding
$\mathrm{enc}_3(\overline{B}, x, x^2)$ and writes this on a segment of the work tape. Then, it runs the Turing machine $\mathcal{U}_2$ from
Theorem 3.1 on this segment of the work tape, obtaining output $y = B(x, x^2)$ (if it eventually halts). Then,
it writes $y$ onto the output tape of $A$. The total time complexity of $A$ on input $x$ then includes the time
complexity of computing the encoding, running Turing machine $\mathcal{U}_2$, and writing the final output.

We also assume that the length of the canonical description of $A$ depends only linearly on the canonical
description of the Turing machine $B$, so that $|A| = O(|B|)$; the constant in the $O(\cdot)$ notation hides the
dependence on the description of the universal Turing machine $\mathcal{U}_2$, the computation of the encoding map
$\mathrm{enc}_3(\cdot)$, etc.


Hardwiring constants. Throughout this paper, we often define Turing machines $M$ with some number
$k$ of inputs, and then for some string $a \in \{0,1\}^*$ define a $(k-1)$-input Turing machine $M_a$ whose
behavior on input $(y_1, \ldots, y_{k-1})$ is to execute $M$ on input $(a, y_1, \ldots, y_{k-1})$. Informally, the Turing machine
$M_a$ “hardwires” the input $a$ onto the first tape of $M$. We informally write $M_a(y_1, \ldots, y_{k-1}) =
M(a, y_1, \ldots, y_{k-1})$.

More precisely, the Turing machine $M_a$ on input $(y_1, \ldots, y_{k-1})$ first computes the encoding $e =
\mathrm{enc}_{k+1}(\overline{M}, a, y_1, \ldots, y_{k-1})$, and then runs the universal Turing machine $\mathcal{U}_k$ on input $e$. The description
of $M_a$ can be computed from $\overline{M}$ and $a$ in time $O(|a| + |M| + O(1))$, where the $O(1)$ comes from the
description length of the universal Turing machine $\mathcal{U}_k$.

Furthermore, the time complexity of the Turing machine $M_a$ on input $(y_1, \ldots, y_{k-1})$ is at most
$\mathrm{poly}(|a|, |y_1|, \ldots, |y_{k-1}|, T)$ where $T$ is the time complexity of $M$ on input $(a, y_1, \ldots, y_{k-1})$. This bound

²⁰ A canonical choice of such an encoding is the following: a tuple $(x_1, \ldots, x_k)$ is encoded into a concatenation of a “dual
rail” encoding of each $x_i$: every bit of $x_i$ is expanded via the map $0 \mapsto 01$, $1 \mapsto 10$, and the end of the string $x_i$ is indicated by 00.




comes from the complexity of encoding the inputs from the $k$ tapes into the $\mathrm{enc}_{k+1}(\cdot)$ format, and then
simulating $M$ by running $\mathcal{U}_k$ on the encoding of $(\overline{M}, a, y_1, \ldots, y_{k-1})$.
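
In programming terms, hardwiring corresponds to fixing the first argument of a function. The following Python sketch is only an analogy: Turing machines are modeled as ordinary Python callables, and the names `hardwire`, `multiply`, and `times_six` are hypothetical.

```python
def hardwire(machine, a):
    """Return the (k-1)-input function obtained by fixing the first input to a."""
    def hardwired(*ys):
        # Mirrors the text: run the original machine on (a, y_1, ..., y_{k-1}).
        return machine(a, *ys)
    return hardwired

def multiply(alpha, x):
    """A 2-input 'machine' used as an example."""
    return alpha * x

times_six = hardwire(multiply, 6)   # a 1-input machine with 6 hardwired
```

The closure stores `a` alongside a reference to `machine`, which parallels the statement that the description of the hardwired machine is computed from the description of $M$ and the string $a$.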


Description complexity bounds. When we bound the description lengths of the Turing machines pre- 
sented in this paper, we do not worry about the exact details of how Turing machines are represented as 
binary strings — as mentioned we assume that some reasonable encoding is used — but instead we distinguish 
between whether the Turing machine description depends on any parameters used elsewhere in the paper. 
Note that whether the description depends on any parameters is different from whether the Turing machine 
takes parameters as input. For example, consider the following (high-level) descriptions of Turing machines
$A$ and $B$:

1. Turing machine $A$ takes two inputs $(\alpha, x)$ and outputs the integer $\alpha \times x$ where $\alpha, x$ are interpreted as
positive integers written in binary.

2. Turing machine $B$ takes one input $x$ and outputs the integer $\beta \times x$ where $\beta$ is a fixed positive integer.


The description of $A$ does not depend on any “external” parameters, so its description length $|A|$ is
some universal constant, which we express as $O(1)$. On the other hand, the description of $B$ depends on
some fixed integer $\beta$; in other words, the parameter $\beta$ is “hard-wired” into $B$’s description. The description
of $B$ can be taken to be the following: $B$ (which is a 1-input Turing machine) simulates the execution of the
2-input Turing machine $A$ where the first input tape of $A$ has the binary representation of $\beta$ written onto
it and the second input tape has $x$ (which is provided as input to $B$) written onto it. Thus the description of
$B$ includes the binary representation of $\beta$, the description of $A$ (whose length is a universal constant), and
the description of the universal Turing machine (whose length is a universal constant). Thus the description
length of $B$ is $O(\log \beta) + O(1) = O(\log \beta)$.
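
The accounting in this example can be mimicked with strings. In the sketch below the two fixed descriptions are placeholder strings (assumptions made for illustration, not actual encodings), and `describe_B` and `description_length` are hypothetical names.

```python
A_DESCRIPTION = "<description of A>"          # placeholder of fixed length
UNIVERSAL_DESCRIPTION = "<description of U>"  # placeholder of fixed length

def describe_B(beta):
    """A stand-in description of B: fixed parts plus beta written in binary."""
    return UNIVERSAL_DESCRIPTION + A_DESCRIPTION + format(beta, "b")

def description_length(beta):
    """Length of the stand-in description, a constant plus the bit length of beta."""
    return len(describe_B(beta))
```

Only the binary representation of the hardwired integer varies, so the length grows by one symbol each time the integer doubles, in line with the $O(\log \beta)$ bound. The input $x$ of $B$ does not appear in the description at all.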


Time complexity bounds. When we bound the time complexity of the Turing machines presented in this 
paper, we similarly do not worry about the exact implementation details, but instead we assume that basic 
computations such as integer arithmetic, string comparisons, etc. are all performed using reasonably efficient 
algorithms that run in polynomial time in the length of the input. When running other Turing machines as 
subroutines, we assume that there is some polynomial overhead due to Theorem 3.1. 


Timeout counters. We sometimes define Turing machines that are required to halt if either the Turing ma- 
chine or some subroutine takes a number of time steps that exceeds some specified threshold. For example, 
we may write “Let M be the Turing machine that on input the description of a Turing machine R and an 
integer T, simulates R on the empty tape and halts if more than T steps are performed.” 

This is useful for establishing an a priori bound on the time complexity of the Turing machine. In
particular we will be able to claim that $M$, as described above, always runs in time at most $\mathrm{poly}(T, |R|)$,
irrespective of the running time of $R$ itself.

We explain how such a Turing machine can be realized formally. We use a variant of the universal 
Turing machine of Theorem 3.1. Using the theorem, it is straightforward to show that given a (single tape) 
Turing machine R there exists another two-tape Turing machine M’ such that, whenever the second tape 
is initialized to the binary representation of T, M’ simulates R and in-between every simulated step of R, 
decrements the second tape and checks if it reaches zero. If it does, then M’ halts and rejects. Otherwise, it 
continues. 




The time complexity of $M'$ is, by definition, at most $O(T \log T)$ for all inputs. The description length
of $M'$ is at most $O(|R|)$, because this is the description size of the universal Turing machine, whose initial
tape has been hardwired with $\overline{R}$. To convert back to a single tape Turing machine and obtain $M$ from $M'$
we can use Theorem 3.1 again, at the cost of a polynomial blow-up in the time complexity.
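
The counter mechanism can be illustrated in Python by modeling a machine as a generator that yields once per simulated step. This modeling choice, the names `run_with_timeout`, `loops_forever`, and `counts_to_three`, and the exact treatment of the boundary step are assumptions of the sketch, not part of the formal construction.

```python
def run_with_timeout(machine, budget):
    """Run a step-wise machine for at most `budget` steps.

    `machine` is a generator function that yields once per simulated step and
    returns its output when it halts. Returns the pair (halted, output).
    """
    steps = machine()
    counter = budget
    while True:
        if counter == 0:
            return False, None        # budget exhausted: halt and reject
        try:
            next(steps)               # perform one simulated step
        except StopIteration as stop:
            return True, stop.value   # the machine halted on its own
        counter -= 1                  # decrement the counter after the step

def loops_forever():
    while True:
        yield

def counts_to_three():
    for _ in range(3):
        yield
    return "done"
```

The wrapper always returns after at most `budget` simulated steps, whatever the wrapped machine does, which is the a priori bound the text is after.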


Input representations. Although the inputs and outputs of a Turing machine are strictly speaking binary 
strings, we oftentimes slightly abuse notation and specify Turing machines that treat their inputs and out- 
puts as objects with more structure, such as finite field elements, integers, symbols from a larger alphabet, 
and so on. In this case we implicitly assume that the Turing machine specification uses a consistent con- 
vention to represent these structured objects as binary strings. Conventions for objects such as integers are 
straightforward. For representations of finite field elements, we refer the reader to Section 3.3.2. 


3.2 Linear spaces 


Linear spaces considered in the paper generally take the form $V = \mathbb{F}^n$ for a finite field $\mathbb{F}$ and integer $n > 1$.
In particular, when we write “let $V$ be a linear space”, unless explicitly stated otherwise we always mean a
space of the form $\mathbb{F}^n$. Let $E = \{e_1, e_2, \ldots, e_n\}$ denote the standard basis of $V$, where for $i \in \{1,2,\ldots,n\}$,

$$ e_i = (0, \ldots, 0, 1, 0, \ldots, 0) $$

has a 1 in the $i$-th coordinate and 0’s elsewhere. We write $\mathrm{End}(V)$ to denote the set of linear transformations
from $V$ to itself.


Definition 3.2 (Register subspace). A register subspace $S$ of $V$ is a subspace that is the span of a subset of
the standard basis of $V$.²¹ We often represent such a subspace as an indicator vector $u \in \{0,1\}^s$, where
$s = \dim(V)$, such that if $\{e_1, \ldots, e_s\}$ is the standard basis of $V$ then $S = \mathrm{span}\{e_i \mid u_i = 1\}$.


Definition 3.3. Let $E = \{e_i\}$ be the standard basis of $V = \mathbb{F}^n$. For two vectors $u = \sum_{i=1}^{n} u_i e_i$,
$v = \sum_{i=1}^{n} v_i e_i$ in $V$, define the dot product

$$ u \cdot v = \sum_{i=1}^{n} u_i v_i \in \mathbb{F}. $$

Let $S$ be a subspace of $V$. The subspace orthogonal to $S$ in $V$ is

$$ S^\perp = \{ u \in V : u \cdot v = 0 \text{ for all } v \in S \}. $$

Although the notation $S^\perp$ does not explicitly refer to $V$, the ambient space will always be clear from context.


We note that over finite fields, the notion of orthogonality does not possess all of the same intuitive
properties of orthogonality over fields such as $\mathbb{R}$ or $\mathbb{C}$; for example, a non-zero subspace $S$ may be orthogonal
to itself (e.g. $\mathrm{span}\{(1,1)\}$ over $\mathbb{F}_2$). However, the following remains true over all fields.


Lemma 3.4. Suppose $S$ is a subspace of $V$. Then $\dim(S) + \dim(S^\perp) = \dim(V)$, and
$(S^\perp)^\perp = S$.


²¹ The use of the term “register” is meant to create an analogy for how the space of multiple qubits is often partitioned into
“registers” containing a few qubits each.




Proof. The statement about the dimensions follows from the fact that vectors in $S^\perp$ are the solutions to a
feasible linear system of equations with $\dim(S)$ linearly independent rows; this implies that the solution
space has dimension exactly $\dim(V) - \dim(S)$. Next, we argue that $S \subseteq (S^\perp)^\perp$. Let $u \in S$. Since
all vectors $v \in S^\perp$ are orthogonal to every vector in $S$, in particular $u$, this implies that $u \in (S^\perp)^\perp$. By
dimension counting, it follows that $(S^\perp)^\perp = S$. □
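
The self-orthogonal example and Lemma 3.4 can be checked by brute force in small cases. The following is a minimal sketch over $\mathbb{F}_2$ only; the helper names `span_f2`, `orth_f2` and `dim_f2` are illustrative and do not appear in the paper.

```python
from itertools import product

def span_f2(gens, n):
    """All F_2-linear combinations of the generator vectors (tuples of 0/1)."""
    vecs = set()
    for coeffs in product((0, 1), repeat=len(gens)):
        v = tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % 2 for i in range(n))
        vecs.add(v)
    return vecs

def orth_f2(S, n):
    """S^perp: all u in F_2^n with u . v = 0 (mod 2) for every v in S."""
    return {u for u in product((0, 1), repeat=n)
            if all(sum(a * b for a, b in zip(u, v)) % 2 == 0 for v in S)}

def dim_f2(S):
    """Dimension of a subspace of F_2^n, read off from its cardinality 2^dim."""
    return len(S).bit_length() - 1

# S = span{(1, 1)} over F_2 is orthogonal to itself, yet the dimensions still add up.
S = span_f2([(1, 1)], 2)
S_perp = orth_f2(S, 2)
```

Here `S_perp` coincides with `S`, while `dim_f2(S) + dim_f2(S_perp)` still equals the ambient dimension, as the lemma states.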


Definition 3.5. Given a linear space $V$, two subspaces $S$ and $T$ of $V$ are said to form a pair of complementary
subspaces of $V$ if
$$S \cap T = \{0\}, \qquad S + T = V.$$

In this case, we write $V = S \oplus T$. Any $x \in V$ can be written as $x = x^S + x^T$ for $x^S \in S$ and $x^T \in T$ in
a unique way. We refer to $x^S$ (resp. $x^T$) as the projection of $x$ onto $S$ parallel to $T$ (resp. onto $T$ parallel to
$S$). We call the unique linear map $L : V \to V$ that maps $x \mapsto x^S$ the projector onto $S$ parallel to $T$.


A given subspace may have many different complementary subspaces: consider the example of $S = \mathrm{span}\{(1,1)\}$ in $\mathbb{F}_2^2$. Different complementary subspaces include $T = \mathrm{span}\{(1,0)\}$ and $T' = \mathrm{span}\{(0,1)\}$.
It is convenient to define the notion of a canonical complement of a subspace $S$, given a basis for $S$.


Definition 3.6. Let $E$ be the standard basis of the linear space $V = \mathbb{F}^n$. Let $F = \{v_1, v_2, \ldots, v_m\} \subseteq V$ be a set
of $m$ linearly independent vectors in $V$. The canonical complement $F^\perp$ of $F$ is the set of $n - m$ independent
vectors defined as follows. Write $v_i = \sum_{j=1}^n a_{i,j} e_j$. Using a canonical algorithm for Gaussian elimination
that works over arbitrary fields, transform the $m \times n$ matrix $(a_{i,j})$ to reduced row echelon form $(b_{i,j})$. Let
$J$ be the set of $m$ column indices of the leading 1 entry in each row of $(b_{i,j})$. The canonical complement is
defined as $F^\perp = \{e_j : j \notin J\}$.
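
The construction in Definition 3.6 can be sketched as follows. This is an illustrative implementation over a prime field $\mathbb{F}_p$ only (the paper works over general finite fields); the function names `rref_mod_p` and `canonical_complement` are assumptions made for this example, and the particular pivoting rule used here is one possible choice of "canonical" Gaussian elimination.

```python
def rref_mod_p(rows, p):
    """Reduced row echelon form over F_p (p prime); returns (matrix, pivot columns)."""
    A = [[x % p for x in r] for r in rows]
    pivots, r = [], 0
    n_cols = len(A[0]) if A else 0
    for c in range(n_cols):
        # First row at or below r with a nonzero entry in column c.
        pr = next((i for i in range(r, len(A)) if A[i][c]), None)
        if pr is None:
            continue
        A[r], A[pr] = A[pr], A[r]
        inv = pow(A[r][c], -1, p)  # modular inverse (Python 3.8+)
        A[r] = [(x * inv) % p for x in A[r]]
        for i in range(len(A)):
            if i != r and A[i][c]:
                f = A[i][c]
                A[i] = [(x - f * y) % p for x, y in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == len(A):
            break
    return A, pivots

def canonical_complement(F, n, p):
    """Standard basis vectors e_j for the non-pivot columns j of RREF(F)."""
    _, pivots = rref_mod_p(F, p)
    return [tuple(int(i == j) for i in range(n)) for j in range(n) if j not in pivots]
```

For $F = \{(1,1)\}$ over $\mathbb{F}_2$ the only pivot column is the first one, so this sketch returns the single vector $(0,1)$, matching the complement $T'$ mentioned above.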


Remark 3.7. We emphasize that the canonical complement $F^\perp$ is a set (rather than a subspace), and it is
defined with respect to a set of linearly independent vectors F. While there are equivalent ways of defining 
a canonical complement of a subspace in a basis-independent manner, we use this particular definition 
because it makes it clear that the canonical complement of a set F is efficiently computable. 


Remark 3.8. Let $E$ be the standard basis of $V$. Suppose the subspace $S$ is a register subspace of $V$ spanned
by $E_0 \subseteq E$. Then it is not hard to verify that the canonical complement of $E_0$ is $E \setminus E_0$ and the span of the
canonical complement coincides with $S^\perp$.


Lemma 3.9. Let $S$ be the span of linearly independent vectors $F = \{v_1,\ldots,v_m\} \subseteq V$ and let $F^\perp$ be the
canonical complement of $F$ as defined in Definition 3.6. Let $T = \mathrm{span}(F^\perp)$. Then

$$S + T = V, \qquad S \cap T = \{0\}.$$


Proof. To show the first item, we must show that every vector in $V$ can be written as a sum of an element of
$S$ and an element of $T$. To do this, let $A = (a_{i,j})$ be the $m \times n$ matrix over $\mathbb{F}$ whose rows are the vectors $v_i$,
as in Definition 3.6. Write $A = UB$ where $U$ is invertible and $B$ is in reduced row echelon form. Let $J$ be
the set of column indices of the leading 1 entries in $B$, as in Definition 3.6. Now, the rows of $B$ are linearly
independent and span the subspace $S$, and the restrictions of these rows to the columns in $J$ are still linearly
independent and span $\mathbb{F}^m$. Thus, for any vector $u \in V$, there is some linear combination $v$ of rows of $B$
such that $u$ and $v$ agree when restricted to the columns in $J$. In other words, there exists $v \in S$ such that $u_j = v_j$
for all $j \in J$. Hence, $u = v + w$ where $w = u - v$ is supported on the columns outside $J$ and therefore lies in $T$, the span of the canonical complement. This shows $S + T = V$.
Counting dimensions shows that necessarily $S \cap T = \{0\}$. □




Definition 3.10. Let $F \subseteq V$ be a set of linearly independent vectors. Let $F^\perp$ be the canonical complement
of $F$. Define the canonical linear map $L \in \mathrm{End}(V)$ with kernel basis $F$ as the projector onto $T$ parallel to
$S$, where $S = \mathrm{span}(F)$ and $T = \mathrm{span}(F^\perp)$. When the basis $F$ for $S$ is clear from context, we refer to this
map as the canonical linear map with kernel $S$.


Definition 3.11. Let $L \in \mathrm{End}(V)$ be a linear map, and let $F$ be a basis for $\ker(L)^\perp$. Define $L^\perp : V \to V$
as the canonical linear map with kernel basis $F$.


Lemma 3.12. Let $L \in \mathrm{End}(V)$ be a linear map and $F$ a basis for $\ker(L)^\perp$. Let $L^\perp \in \mathrm{End}(V)$ be the
linear map defined in Definition 3.11. Then $\ker(L^\perp) = \ker(L)^\perp$.


Proof. First, to set notation, let $S = \mathrm{span}(F) = \ker(L)^\perp$ and $T = \mathrm{span}(F^\perp)$. By Lemma 3.9, any vector
$v \in V$ can be uniquely decomposed as $v = v^S + v^T$ with $v^S \in S$ and $v^T \in T$.

Now we show the lemma. First we show that $\ker(L^\perp) \subseteq \ker(L)^\perp$. Let $v \in \ker(L^\perp)$, and write
$v = v^S + v^T$. By the definition of $L^\perp$ as the projector onto $T$ parallel to $S$, it follows that $v^T = 0$, and hence $v \in S = \ker(L)^\perp$.

It remains to show the other direction, that is, $\ker(L)^\perp \subseteq \ker(L^\perp)$. Let $v \in \ker(L)^\perp = S$. Then
$v = v^S + v^T$ where $v^T = 0$, and hence by the definition of $L^\perp$, $L^\perp v = 0$. Hence $v \in \ker(L^\perp)$. □


3.3 Finite fields 


Let $p$ be a prime and $q = p^k$ be a prime power. We denote the finite fields of $p$ and $q$ elements by $\mathbb{F}_p$ and
$\mathbb{F}_q$ respectively. The prime $p$ is the characteristic of the field $\mathbb{F}_q$, and the field $\mathbb{F}_p$ is the prime subfield of $\mathbb{F}_q$. We
sometimes omit the subscript and simply use $\mathbb{F}$ to denote the finite field when the size of the field is implicit
from context. For general background on finite fields, and explicit algorithms for elementary arithmetic
operations, we refer to [MP13].


3.3.1 Subfields and bases 


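Let $q$ be a prime power and $k$ an integer. The field $\mathbb{F}_q$ is a subfield of $\mathbb{F}_{q^k}$ and $\mathbb{F}_{q^k}$ is a linear space of
dimension $k$ over $\mathbb{F}_q$. Let $\{e_i\}_{i=1}^k$ be a basis of $\mathbb{F}_{q^k}$ as a linear space over $\mathbb{F}_q$. Introduce a bijection
$\kappa_q : \mathbb{F}_{q^k} \to \mathbb{F}_q^k$ between $\mathbb{F}_{q^k}$ and $\mathbb{F}_q^k$ defined with respect to the basis $\{e_i\}_{i=1}^k$ by

$$\kappa_q : a \mapsto (a_i)_{i=1}^k$$

where $a = \sum_{i=1}^k a_i e_i$. This map satisfies several nice properties. First, the map is an isomorphism of $\mathbb{F}_q$-vector
spaces: it is $\mathbb{F}_q$-linear and addition in $\mathbb{F}_{q^k}$ naturally corresponds to vector addition in $\mathbb{F}_q^k$. Namely, for
all $a, b \in \mathbb{F}_{q^k}$,

$$\kappa_q(a + b) = \kappa_q(a) + \kappa_q(b).$$

Second, multiplication by a field element in $\mathbb{F}_{q^k}$ corresponds to a linear map on $\mathbb{F}_q^k$. For all $a \in \mathbb{F}_{q^k}$, there
exists a matrix $K_a \in \mathrm{M}_k(\mathbb{F}_q)$ such that for all $b \in \mathbb{F}_{q^k}$,

$$\kappa_q(ab) = K_a\,\kappa_q(b).$$

The matrix $K_a$ is called the multiplication table of $a$ with respect to the basis $\{e_i\}_{i=1}^k$.

We extend the map $\kappa_q$ to vectors, matrices and sets over $\mathbb{F}_{q^k}$. For $v = (v_1, v_2, \ldots, v_n) \in \mathbb{F}_{q^k}^n$ define

$$\kappa_q(v) = (\kappa_q(v_i))_{i=1}^n \in \mathbb{F}_q^{kn}.$$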




Similarly, for a matrix $M = (M_{i,j}) \in \mathrm{M}_{m,n}(\mathbb{F}_{q^k})$, define

$$\varkappa_q(M) = (K_{M_{i,j}}) \in \mathrm{M}_{mk,nk}(\mathbb{F}_q),$$

the block matrix whose $(i,j)$-th block is the multiplication table $K_{M_{i,j}}$ of $M_{i,j}$ with respect to the basis $\{e_i\}$.
For a set $S$ of vectors in $\mathbb{F}_{q^k}^n$ define

$$\kappa_q(S) = \{\kappa_q(v) : v \in S\}.$$

We omit the subscript and write $\kappa$ and $\varkappa$ for $\kappa_q$ and $\varkappa_q$ respectively when $q$ equals $p$, the characteristic of
the field.

The trace of $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$ is defined as

$$\mathrm{tr}_{q^k \to q} : a \mapsto \mathrm{Tr}(K_a) \tag{10}$$

for $a \in \mathbb{F}_{q^k}$, where $\mathrm{Tr}(K_a)$ is the trace of the multiplication table of $a$ with respect to the basis $\{e_i\}$. By
definition, the trace is an $\mathbb{F}_q$-linear map from $\mathbb{F}_{q^k}$ to $\mathbb{F}_q$. The trace is in fact independent of the choice of
basis, which can be seen from the following equivalent definition [MP13, Definition 2.1.80]:

$$\mathrm{tr}_{q^k \to q}(a) = \sum_{j=0}^{k-1} a^{q^j}.$$


A dual basis $\{e'_1, e'_2, \ldots, e'_k\}$ of $\{e_1, e_2, \ldots, e_k\}$ is a basis such that $\mathrm{tr}_{q^k \to q}(e_i e'_j) = \delta_{ij}$ for all $i, j \in \{1,2,\ldots,k\}$. A self-dual basis is one that is equal to its dual. If for some $a \in \mathbb{F}_{q^k}$ the set $\{a^{q^j}\}_{j=0}^{k-1}$ forms a
basis of $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$, the basis is called a normal basis. Self-dual and normal bases do not exist for all fields
but are guaranteed to exist under certain conditions; in particular, if $q = 2$ and $k$ is odd, as will be shown in
Lemma 3.16.

We record some convenient facts about the maps $\kappa(\cdot)$ and $\varkappa(\cdot)$ for self-dual bases.


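Lemma 3.13. Let $q$ be a prime power, $k$ an integer and $\{e_i\}$ a self-dual basis for $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$. The map
$\kappa_q(\cdot)$ corresponding to $\{e_i\}$ satisfies the following properties:

1. For all $x \in \mathbb{F}_{q^k}$, $\kappa_q(x) = (\mathrm{tr}_{q^k \to q}(x e_1), \ldots, \mathrm{tr}_{q^k \to q}(x e_k))$.

2. For all $x, y \in \mathbb{F}_{q^k}$, $\mathrm{tr}_{q^k \to q}(xy) = \kappa_q(x) \cdot \kappa_q(y)$.

3. For all $M \in \mathrm{M}_{m,n}(\mathbb{F}_{q^k})$ and $v \in \mathbb{F}_{q^k}^n$ we have $\varkappa_q(M)\,\kappa_q(v) = \kappa_q(Mv)$.
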
Proof. We show each property by direct calculation.

1. Let $\kappa_q(x) = (x_1, \ldots, x_k)$. Then by the definition of $\kappa_q$, $x = \sum_i x_i e_i$. Multiplying both sides by $e_j$ and
taking the trace yields $\mathrm{tr}_{q^k \to q}(x e_j) = \sum_i x_i\, \mathrm{tr}_{q^k \to q}(e_i e_j) = x_j$, where the last equality is by self-duality
of the basis.

2. Write $\kappa_q(x) = (x_1, \ldots, x_k)$ and $\kappa_q(y) = (y_1, \ldots, y_k)$. Then
$$\mathrm{tr}_{q^k \to q}(xy) = \mathrm{tr}_{q^k \to q}\Big(\sum_{i,j} x_i y_j e_i e_j\Big) = \sum_{i,j} x_i y_j\, \mathrm{tr}_{q^k \to q}(e_i e_j) = \kappa_q(x) \cdot \kappa_q(y),$$
where the last equality uses the self-duality of the basis.




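3. We compute the $i$-th $\mathbb{F}_q^k$-block of $\kappa_q(Mv)$:
$$\kappa_q(Mv)_i = \sum_j \kappa_q(M_{ij} v_j) = \sum_j K_{M_{ij}}\, \kappa_q(v_j) = \sum_j \varkappa_q(M)_{ij}\, \kappa_q(v_j),$$
where we have applied the definition of the multiplication table $K_{M_{ij}}$ and the map $\varkappa_q(\cdot)$.
□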
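
The first two items of Lemma 3.13 can be checked exhaustively in a small case. The sketch below assumes a concrete model of $\mathbb{F}_8 = \mathbb{F}_2[\alpha]/(\alpha^3 + \alpha^2 + 1)$ with elements stored as 3-bit integers, and uses the normal basis $\{\alpha, \alpha^2, \alpha^4\}$, which happens to be self-dual for this modulus. The modulus, the encoding and all function names are choices made for this illustration and are not taken from the paper.

```python
MOD = 0b1101  # x^3 + x^2 + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply two elements of F_8 stored as 3-bit polynomials in alpha."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= MOD
    return r

def tr(a):
    """Trace F_8 -> F_2: a + a^2 + a^4, which is always the element 0 or 1."""
    a2 = gf_mul(a, a)
    return a ^ a2 ^ gf_mul(a2, a2)

ALPHA = 0b010
A2 = gf_mul(ALPHA, ALPHA)
BASIS = [ALPHA, A2, gf_mul(A2, A2)]  # alpha, alpha^2, alpha^4

def kappa(a):
    """Coordinates of a in BASIS, read off with the trace (item 1 of the lemma)."""
    return tuple(tr(gf_mul(a, e)) for e in BASIS)

def from_coords(c):
    """Inverse of kappa: the field element sum_i c_i e_i."""
    r = 0
    for ci, e in zip(c, BASIS):
        if ci:
            r ^= e
    return r
```

With these definitions, `from_coords(kappa(x)) == x` for every `x`, and `tr(gf_mul(x, y))` agrees with the dot product of `kappa(x)` and `kappa(y)` modulo 2.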


For $z \in \mathbb{F}^n$ and $V, W$ a pair of complementary subspaces, recall from Definition 3.5 the notation $z^V$ for
the projection of $z$ onto $V$ parallel to $W$.


Lemma 3.14. Let $\kappa_q(\cdot)$ denote the map corresponding to a basis $\{e_i\}$ for $\mathbb{F}_{q^k}$ over $\mathbb{F}_q$. Let $V$ be a subspace
of $\mathbb{F}_{q^k}^n$ with linearly independent basis $\{b_1, \ldots, b_t\} \subseteq \mathbb{F}_{q^k}^n$. Then the following hold:

1. $\kappa_q(V)$ is a subspace of $\mathbb{F}_q^{kn}$.

2. $\{\kappa_q(e_i b_j)\}_{i,j}$ is a linearly independent basis of $\kappa_q(V)$ over $\mathbb{F}_q$.

3. Let $V, W$ be complementary subspaces of $\mathbb{F}_{q^k}^n$. Then $V' = \kappa_q(V)$ and $W' = \kappa_q(W)$ are complementary
subspaces of $\mathbb{F}_q^{kn}$, and furthermore for all vectors $z \in \mathbb{F}_{q^k}^n$ we have $\kappa_q(z^V) = \kappa_q(z)^{V'}$ and
$\kappa_q(z^W) = \kappa_q(z)^{W'}$.

Proof. For the first item, we verify that $\kappa_q(V)$ is a subspace. Since $V$ is a subspace, it contains $0 \in \mathbb{F}_{q^k}^n$
and therefore $\kappa_q(0) = 0$ is also in $\kappa_q(V)$. Let $u', v' \in \kappa_q(V)$. Using that $\kappa_q$ is a bijection there exist
$u, v \in V$ such that $u' = \kappa_q(u)$ and $v' = \kappa_q(v)$. Therefore

$$u' + v' = \kappa_q(u) + \kappa_q(v) = \kappa_q(u + v) \in \kappa_q(V),$$

where the inclusion follows because $V$ is a subspace and thus contains $u + v$. Finally, for all $x' \in \mathbb{F}_q$ and all
$v \in V$, we have that $x' \kappa_q(v) = \kappa_q(x' v) \in \kappa_q(V)$, where we used that $V$ is closed under scalar multiplication
by $\mathbb{F}_{q^k}$ and thus by $\mathbb{F}_q$ (since $\mathbb{F}_q$ is a subfield of $\mathbb{F}_{q^k}$). Thus $\kappa_q(V)$ is closed under scalar multiplication by
$\mathbb{F}_q$.

For the second item, note that an element $v \in V$ can be expressed uniquely as $v = \sum_{i=1}^t v_i b_i$ for
$v_i \in \mathbb{F}_{q^k}$. The element $v_i$ can further be written as $\sum_j v_{i,j} e_j$ where $v_{i,j} \in \mathbb{F}_q$. Thus $v$ is a linear combination
of the vectors $\{e_j b_i\}$, and therefore $\kappa_q(v)$ is a linear combination of the vectors $\{\kappa_q(e_j b_i)\}$. To establish that
the vectors $\{\kappa_q(e_j b_i)\}$ are linearly independent, suppose towards contradiction that they are not. Then there
would exist $\alpha_{i,j} \in \mathbb{F}_q$ such that at least one $\alpha_{i,j}$ is nonzero and

$$0 = \sum_{i,j} \alpha_{i,j}\, \kappa_q(e_j b_i) = \kappa_q\Big(\sum_i \Big(\sum_j \alpha_{i,j} e_j\Big) b_i\Big) = \kappa_q\Big(\sum_i \beta_i b_i\Big),$$




where we define $\beta_i = \sum_j \alpha_{i,j} e_j$. Since at least one $\alpha_{i,j} \neq 0$ and the $\{e_j\}$ are linearly independent over $\mathbb{F}_q$,
there exists $i$ such that $\beta_i \neq 0$, which means that there is a non-trivial linear combination of the basis elements
$b_i$ that equals 0 under $\kappa_q(\cdot)$. Since $\kappa_q(\cdot)$ is injective, we get a contradiction with linear independence
of the $\{b_i\}$.

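For the third item, we observe that $\kappa_q(V)$ and $\kappa_q(W)$ must be complementary because $\kappa_q(\cdot)$ is a linear
map as well as a bijection. Let $\{v_1, \ldots, v_m\}$ and $\{v_{m+1}, \ldots, v_n\}$ denote bases for $V$ and $W$, respectively.
Thus the set $\{v_1, \ldots, v_n\}$ forms a basis for $\mathbb{F}_{q^k}^n$ and, from the previous item, the set $\{\kappa_q(e_j v_i)\}_{i,j}$ is a basis
for $\mathbb{F}_q^{kn}$. Furthermore, the sets $\{\kappa_q(e_j v_i)\}_{j,\, i = 1,\ldots,m}$ and $\{\kappa_q(e_j v_i)\}_{j,\, i = m+1,\ldots,n}$ are bases for $\kappa_q(V)$ and $\kappa_q(W)$,
respectively.

There is a unique choice of coefficients $\alpha_{i,j} \in \mathbb{F}_q$ such that $\kappa_q(z) = \sum_{i,j} \alpha_{i,j}\, \kappa_q(e_j v_i)$. But then

$$\kappa_q(z) = \kappa_q\Big(\sum_i \Big(\sum_j \alpha_{i,j} e_j\Big) v_i\Big) = \kappa_q\Big(\sum_i a_i v_i\Big),$$

where we define $a_i = \sum_j \alpha_{i,j} e_j$. Since $\kappa_q(\cdot)$ is a bijection, this implies that $z = \sum_i a_i v_i$, and therefore
$z^V = \sum_{i=1}^m a_i v_i$ (and similarly $z^W = \sum_{i=m+1}^n a_i v_i$). This implies that

$$\kappa_q(z)^{V'} = \sum_{i=1}^m \sum_j \alpha_{i,j}\, \kappa_q(e_j v_i) = \kappa_q(z^V),$$

and similarly $\kappa_q(z)^{W'} = \kappa_q(z^W)$. This completes the proof of the lemma. □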


3.3.2 Bit string representations 


As mentioned at the end of Section 3.1, we sometimes treat the inputs and outputs of Turing machines as 
representing elements of a finite field, or a vector space over a finite field. We discuss some important details 
about bit representations of finite field elements and arithmetic over finite fields. 

In the paper we only consider fields $\mathbb{F}_{2^k}$ where $k$ is odd.


Definition 3.15. A field size $q$ is called an admissible field size if $q = 2^k$ for odd $k$.


Elements of $\mathbb{F}_2$ are naturally represented using bits. To represent elements of $\mathbb{F}_{2^k}$ as binary strings we
require the specification of a basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$. Given a basis $\{e_i\}_{i=1}^k$ of $\mathbb{F}_{2^k}$, every element $a \in \mathbb{F}_{2^k}$ has
a unique expansion $a = \sum_{i=1}^k a_i e_i$ and can be represented as the $k$-bit string corresponding to $\kappa(a) \in \mathbb{F}_2^k$.
Note that we omitted the subscript 2 of $\kappa$ as it maps to the linear space over the prime subfield $\mathbb{F}_2$. Thus the
binary representation of $a \in \mathbb{F}_{2^k}$ is defined as the natural binary representation of $\kappa(a) \in \mathbb{F}_2^k$ (which in turn
is the $\mathbb{F}_2$-representation of $a$). Throughout the paper we freely associate between the binary representation
of a field element $a \in \mathbb{F}_{2^k}$ and its $\mathbb{F}_2$-representation, although, technically speaking, these are distinct
objects.

Given the representations $\kappa(a), \kappa(b)$ of $a, b \in \mathbb{F}_{2^k}$, to compute the binary representation of $a + b$ it
suffices to compute the addition bit-wise, modulo 2. Computing the multiplication of elements $a, b$ requires
the specification of the multiplication tables $\{K_{e_i} \in \mathrm{M}_k(\mathbb{F}_2)\}_{i=1}^k$ for the basis $\{e_i\}$. Given representations
$\kappa(a) = (a_i)_{i=1}^k$, $\kappa(b) = (b_i)_{i=1}^k$ for $a, b \in \mathbb{F}_{2^k}$ respectively, the representation $\kappa(ab)$ of the product $ab$ is
computed as

$$\kappa(ab) = \sum_{i=1}^k a_i\, \kappa(e_i b) = \sum_{i=1}^k a_i\, (K_{e_i}\, \kappa(b)). \tag{11}$$




Thus, using our representation for field elements, efficiently performing finite field arithmetic in $\mathbb{F}_{2^k}$ reduces
to having access to the multiplication table of some basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$.
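
As an illustration of Eq. (11), the following sketch multiplies elements of $\mathbb{F}_8$ using only their bit-vector representations and precomputed multiplication tables $K_{e_i}$. It assumes the same toy model as before, $\mathbb{F}_2[\alpha]/(\alpha^3+\alpha^2+1)$ with the self-dual normal basis $\{\alpha,\alpha^2,\alpha^4\}$; the tables are built here from a reference multiplication `gf_mul` rather than by the algorithm of Lemma 3.16, and all names are hypothetical.

```python
MOD = 0b1101                    # x^3 + x^2 + 1 over F_2
BASIS = [0b010, 0b100, 0b111]   # alpha, alpha^2, alpha^4 in the polynomial encoding

def gf_mul(a, b):
    """Reference multiplication in F_8 (3-bit polynomial encoding)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= MOD
    return r

def tr(a):
    """Trace F_8 -> F_2."""
    a2 = gf_mul(a, a)
    return a ^ a2 ^ gf_mul(a2, a2)

def kappa(a):
    """Bit-vector representation of a with respect to the self-dual basis."""
    return tuple(tr(gf_mul(a, e)) for e in BASIS)

def mult_table(a):
    """K_a as a list of rows; column j of K_a is kappa(a * e_j)."""
    cols = [kappa(gf_mul(a, e)) for e in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

TABLES = [mult_table(e) for e in BASIS]  # the tables K_{e_1}, ..., K_{e_k}

def mat_vec(K, v):
    """Matrix-vector product over F_2."""
    return tuple(sum(K[i][j] * v[j] for j in range(len(v))) % 2 for i in range(len(K)))

def mul_repr(ka, kb):
    """kappa(ab) = sum_i a_i (K_{e_i} kappa(b)), as in Eq. (11)."""
    out = (0,) * len(kb)
    for a_i, K in zip(ka, TABLES):
        if a_i:
            out = tuple(x ^ y for x, y in zip(out, mat_vec(K, kb)))
    return out
```

Once `TABLES` is available, `mul_repr` never touches the polynomial encoding: it works purely with the $k$-bit representations, which is the point of the reduction described above.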

The following fact provides an efficient deterministic algorithm for computing a self-dual normal basis
for $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$ and the corresponding multiplication tables for any odd $k$.


Lemma 3.16. There exists a deterministic algorithm that, given an odd integer $k > 0$, outputs a self-dual
normal basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$ and the multiplication tables of the basis in $\mathrm{poly}(k)$ time.


Proof. The algorithm of Shoup [Sho90, Theorem 3.2] shows that for prime $p$, an irreducible polynomial in
$\mathbb{F}_p[X]$ of degree $k$ can be computed in time $\mathrm{poly}(p, k)$. Then, the algorithm of Lenstra [LJ91, Theorem 1.1]
shows that given such an irreducible polynomial, the multiplication table of a normal basis of $\mathbb{F}_{p^k}$ over $\mathbb{F}_p$
can be computed in $\mathrm{poly}(k, \log p)$ time. Finally, the algorithm of Wang [Wan89] shows that for odd $k$ and
a multiplication table $K$ of a normal basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$, a multiplication table $K'$ for a self-dual normal
basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$ can be computed in $\mathrm{poly}(k)$ time. Putting these three algorithms together yields the
claimed statement. □


For $q = 2^k$ with $k$ odd (i.e. $q$ is an admissible field size) we use the shorthand $\mathrm{tr}$ for $\mathrm{tr}_{q \to 2}$.


Lemma 3.17. Let $k$ be an odd integer and $\{e_i\}_{i=1}^k$ be a self-dual normal basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$. Then
$\mathrm{tr}(e_i) = 1$ for all $i$, and furthermore the representation $\kappa(1)$ of the unit $1 \in \mathbb{F}_{2^k}$ is the all-ones vector in $\mathbb{F}_2^k$.


Proof. Since $\{e_i\}$ is a normal basis, $e_i = \alpha^{2^i}$ for some element $\alpha \in \mathbb{F}_{2^k}$. Furthermore, for every element
$b \in \mathbb{F}_{2^k}$, we have that $\mathrm{tr}(b^2) = \mathrm{tr}(b)$. This is because

$$\mathrm{tr}(b^2) = \sum_{i=0}^{k-1} b^{2^{i+1}} = \sum_{i=0}^{k-1} b^{2^i} = \mathrm{tr}(b),$$

where we use that $b^{2^k} = b$ for all $b \in \mathbb{F}_{2^k}$. Since $e_{i+1} = e_i^2$, we get that $\mathrm{tr}(e_i) = \mathrm{tr}(e_j)$ for all $i, j$. It cannot
be the case that $\mathrm{tr}(e_i) = 0$ for all $i$. Suppose that this were the case. By linearity of the trace, this would imply that $\mathrm{tr}(b) = 0$ for all
$b \in \mathbb{F}_{2^k}$. But then for all $j \in \{1,\ldots,k\}$ and for some $b \neq 0$, we would also have by item 1 of Lemma 3.13
that $b_j = \mathrm{tr}(b e_j) = 0$ where $b = \sum_j b_j e_j$ with $b_j \in \mathbb{F}_2$. This implies that $b$ is the zero element of $\mathbb{F}_{2^k}$,
which is a contradiction. Thus $\mathrm{tr}(e_i) = 1$ for all $i = 1, 2, \ldots, k$.

The “furthermore” part follows from item 1 of Lemma 3.13:

$$\kappa(1) = (\mathrm{tr}(1 \cdot e_1), \ldots, \mathrm{tr}(1 \cdot e_k)) = (1, \ldots, 1).$$
□


Lemma 3.18. For any odd integer $k$, let $\{e_i\}_{i=1}^k$ denote the self-dual normal basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$ that is
returned by the algorithm specified in Lemma 3.16 on input $k$. Then the following can be computed in time
$\mathrm{poly}(k)$ on input $k$:

1. The representation $\kappa(a + b)$ of the sum $a + b$ given the representations $\kappa(a)$ and $\kappa(b)$ of $a, b \in \mathbb{F}_{2^k}$.

2. The representation $\kappa(ab)$ of the product $ab$ given the representations $\kappa(a)$ and $\kappa(b)$ of $a, b \in \mathbb{F}_{2^k}$.

3. The multiplication table $K_a \in \mathrm{M}_k(\mathbb{F}_2)$ given the representation $\kappa(a)$ of $a \in \mathbb{F}_{2^k}$.

4. The representation $\kappa(a^{-1})$ of the multiplicative inverse of $a \in \mathbb{F}_{2^k}$ given the representation $\kappa(a)$.




5. The trace $\mathrm{tr}(a)$ given the multiplication table $K_a$ of $a \in \mathbb{F}_{2^k}$.


Furthermore, for all integers $n$, the representations of projections $\kappa(x^S)$ and $\kappa(x^T)$ of $x \in \mathbb{F}_{2^k}^n$ for complementary
subspaces $S, T$ of $\mathbb{F}_{2^k}^n$ can be computed in $\mathrm{poly}(k, n)$ time, given the representations $\kappa(x)$,
$\{\kappa(v_1), \kappa(v_2), \ldots, \kappa(v_m)\}$ and $\{\kappa(w_1), \kappa(w_2), \ldots, \kappa(w_{n-m})\}$ where $\{v_i\}$ and $\{w_j\}$ are bases for $S$ and
$T$ respectively.


Proof. Given an odd integer $k$ as input, by Lemma 3.16 it is possible to compute the self-dual normal basis
$\{e_i\}_{i=1}^k$ together with the multiplication tables $K_{e_i}$ for $i = 1, 2, \ldots, k$. Addition is performed component-wise,
and multiplication is done using Eq. (11). For the multiplication table $K_a$ it suffices to compute the $k$
products $\kappa(a e_i)$ for $i \in \{1, \ldots, k\}$. To compute inverses, observe that $\kappa(1) = \kappa(a a^{-1}) = K_a\, \kappa(a^{-1})$. The
matrices $K_a$ are invertible over $\mathbb{F}_2$ for $a \neq 0$, and therefore $\kappa(a^{-1}) = K_a^{-1} \kappa(1)$; moreover, $\kappa(1)$ is the all-ones vector
by Lemma 3.17 and can thus be efficiently computed. Inverting the matrix can be done in $\mathrm{poly}(k)$ time via
Gaussian elimination. The trace of an element $a \in \mathbb{F}_{2^k}$ is by definition the trace of the multiplication table
$K_a$.

For the “Furthermore” part, we observe that since $\{v_1, v_2, \ldots, v_m\} \cup \{w_1, w_2, \ldots, w_{n-m}\}$ forms a basis
for $\mathbb{F}_{2^k}^n$, there is a unique way to write $x$ as an $\mathbb{F}_{2^k}$-linear combination of $\{v_i\}$ and $\{w_j\}$. Via Gaussian
elimination over $\mathbb{F}_{2^k}$, the $\mathbb{F}_2$-representation of the coefficients of this linear combination can be computed in
$\mathrm{poly}(n, k)$ time. Here we use that addition, multiplication and division over $\mathbb{F}_{2^k}$ can be performed in time
$\mathrm{poly}(k)$ using the previous items of the Lemma. □
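
The inversion step of the proof, $\kappa(a^{-1}) = K_a^{-1}\kappa(1)$, can be sketched as follows in the toy field $\mathbb{F}_8 = \mathbb{F}_2[\alpha]/(\alpha^3+\alpha^2+1)$ with self-dual normal basis $\{\alpha, \alpha^2, \alpha^4\}$. As before, this model and every function name are assumptions made for illustration; the basis tables are derived from a reference multiplication rather than from the algorithm of Lemma 3.16.

```python
MOD = 0b1101                    # x^3 + x^2 + 1 over F_2
BASIS = [0b010, 0b100, 0b111]   # alpha, alpha^2, alpha^4

def gf_mul(a, b):
    """Reference multiplication in F_8 (3-bit polynomial encoding)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= MOD
    return r

def tr(a):
    a2 = gf_mul(a, a)
    return a ^ a2 ^ gf_mul(a2, a2)

def kappa(a):
    return tuple(tr(gf_mul(a, e)) for e in BASIS)

def table_from_repr(ka):
    """K_a = sum_i a_i K_{e_i}, where column j of K_{e_i} is kappa(e_i e_j)."""
    k = len(BASIS)
    K = [[0] * k for _ in range(k)]
    for a_i, e_i in zip(ka, BASIS):
        if a_i:
            for j, e_j in enumerate(BASIS):
                col = kappa(gf_mul(e_i, e_j))
                for r in range(k):
                    K[r][j] ^= col[r]
    return K

def solve_f2(K, rhs):
    """Solve K x = rhs over F_2 by Gauss-Jordan elimination (K must be invertible)."""
    n = len(K)
    A = [list(K[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        pr = next(i for i in range(c, n) if A[i][c])  # StopIteration if K is singular
        A[c], A[pr] = A[pr], A[c]
        for i in range(n):
            if i != c and A[i][c]:
                A[i] = [x ^ y for x, y in zip(A[i], A[c])]
    return tuple(A[i][n] for i in range(n))

def inverse_repr(ka):
    """kappa(a^{-1}) = K_a^{-1} kappa(1), with kappa(1) the all-ones vector."""
    return solve_f2(table_from_repr(ka), (1,) * len(ka))
```

The call `inverse_repr(kappa(a))` is only meaningful for nonzero `a`; for `a = 0` the table $K_a$ is the zero matrix and the elimination fails.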


Remark 3.19. Throughout this paper, whenever we refer to Turing machines that perform computations
with elements of a field $\mathbb{F}_q$ for an admissible field size $q = 2^k$, we mean that the Turing machines are
representing elements of $\mathbb{F}_q$ as vectors in $\{0,1\}^k$ using the basis specified by the algorithm of Lemma 3.16
and performing arithmetic as described in Lemma 3.18.


3.4 Polynomials and the low-degree code 


In this section, we introduce some basic definitions about polynomials over finite fields. An $m$-variate
polynomial $f$ over $\mathbb{F}_q$ is a function of the form

$$f(x_1, \ldots, x_m) = \sum_{\alpha \in \{0,1,\ldots,q-1\}^m} c_\alpha\, x_1^{\alpha_1} \cdots x_m^{\alpha_m},$$

where $\{c_\alpha\}$ is a collection of coefficients in $\mathbb{F}_q$. We say that $f$ has individual degree (at most) $d$ if $c_\alpha \neq 0$
only if $\alpha_i \leq d$ for all $1 \leq i \leq m$, and that it has total degree (at most) $d$ if $c_\alpha \neq 0$ only if $\sum_i \alpha_i \leq d$. (Affine)
multilinear polynomials are polynomials with individual degree 1.

Low-degree polynomials play an important role in this paper. For one, the classical and quantum low-degree
tests of Section 7 are nonlocal games that efficiently certify that the players’ answers are consistent
with low-degree polynomials. Low-degree polynomials are also crucial in the probabilistically checkable
proof (PCP) constructions that are used in the “answer reduction” transformation (Section 10).

We recall the Schwartz-Zippel lemma: 


Lemma 3.20 (Schwartz-Zippel lemma [Sch80, Zip79]). Let $f, g : \mathbb{F}_q^m \to \mathbb{F}_q$ be two unequal polynomials
with total degree at most $d$. Then

$$\Pr_{x \sim \mathbb{F}_q^m}[f(x) = g(x)] \leq d/q.$$
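
The bound can be checked exhaustively for small parameters. The sketch below works over a prime field $\mathbb{F}_5$ for simplicity (the paper uses fields of size $2^k$), and represents a polynomial as a dictionary from exponent tuples to coefficients; this representation and the names `evaluate`, `agreement_count`, `f` and `g` are choices made for the example.

```python
from itertools import product

def evaluate(poly, x, p):
    """Evaluate a polynomial given as {exponent tuple: coefficient} at x over F_p."""
    total = 0
    for exps, c in poly.items():
        term = c
        for xi, e in zip(x, exps):
            term = term * pow(xi, e, p) % p
        total = (total + term) % p
    return total

def agreement_count(f, g, m, p):
    """Number of points of F_p^m where f and g agree, and the total number of points."""
    pts = list(product(range(p), repeat=m))
    agree = sum(evaluate(f, x, p) == evaluate(g, x, p) for x in pts)
    return agree, len(pts)

# Two distinct polynomials of total degree 2 in two variables over F_5.
f = {(1, 1): 1, (0, 0): 2}   # x0*x1 + 2
g = {(2, 0): 1, (0, 1): 1}   # x0^2 + x1
```

For this pair, `agreement_count(f, g, 2, 5)` counts agreements among the 25 points of $\mathbb{F}_5^2$; the lemma guarantees that the fraction is at most $d/q = 2/5$.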




The low-degree code. The Schwartz-Zippel lemma implies that the set of low-degree polynomials forms
an error-correcting code with good distance, which we call the low-degree code.$^{22}$

Fix an integer $m \in \mathbb{N}$ and let $M = 2^m$. For every $y \in \{0,1\}^m$ define the following $m$-variate
multilinear polynomial over $\mathbb{F}_q$:

$$\mathrm{ind}_{m,y}(x) = \prod_{i : y_i = 1} x_i \cdot \prod_{i : y_i = 0} (1 - x_i).$$

Here, we identify $\{0,1\}$ as a subset of $\mathbb{F}_q$. Notice that, when restricted to the subcube $\{0,1\}^m$, $\mathrm{ind}_{m,y}$ is
zero everywhere except for when $x = y$. For $a \in \mathbb{F}_q^M$, label the coordinates of $a$ as $a_y$ for $y \in \{0,1\}^m$
(identifying the latter set with $\{1, \ldots, M\}$ using, say, the lexicographic ordering on strings). For any such $a$
define the multilinear polynomial $g_a : \mathbb{F}_q^m \to \mathbb{F}_q$, called the low-degree encoding of $a$, as follows:

$$g_a(x) = \sum_{y \in \{0,1\}^m} a_y \cdot \mathrm{ind}_{m,y}(x). \tag{12}$$

Note that for any $y \in \{0,1\}^m$, $g_a(y) = a_y$. Furthermore, the map $a \mapsto g_a$ is linear: for every $x \in \mathbb{F}_q^m$,

$$g_a(x) = a \cdot \mathrm{ind}_m(x), \tag{13}$$

where $\mathrm{ind}_m(x)$ is the vector $(\mathrm{ind}_{m,y}(x))_{y \in \{0,1\}^m} \in \mathbb{F}_q^M$.


Lemma 3.21. Let $q$ be an admissible field size, and let $M = 2^m$ for some integer $m$. The complexity of
computing the evaluation of the low-degree encoding $g_a(x)$, given $a \in \mathbb{F}_q^M$ and $x \in \mathbb{F}_q^m$ represented in
binary, is $\mathrm{poly}(M, \log q)$.


Proof. Computing $g_a(x)$ requires computing the sum of products $a_y \cdot \mathrm{ind}_{m,y}(x)$ over all $y \in \{0,1\}^m$. Via
Lemma 3.18, evaluating $\mathrm{ind}_{m,y}(x)$ takes time $\mathrm{poly}(m, \log q)$ because it requires performing $m$ multiplications
of $\mathbb{F}_q$ elements, and therefore the product $a_y \cdot \mathrm{ind}_{m,y}(x)$ requires $\mathrm{poly}(m, \log q)$ time. Computing the
sum over all $y \in \{0,1\}^m$ requires $M \cdot \mathrm{poly}(m, \log q) = \mathrm{poly}(M, \log q)$ time, as claimed. □
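
The evaluation procedure in the proof can be sketched directly from Eq. (12). For simplicity this example assumes a prime field $\mathbb{F}_7$ with ordinary modular arithmetic instead of an admissible field $\mathbb{F}_{2^k}$; the constant `P` and the names `ind` and `encode_eval` are hypothetical.

```python
from itertools import product

P = 7  # illustrative prime field F_7; the paper uses fields of size 2^k

def ind(y, x, p=P):
    """ind_{m,y}(x) = prod_{i: y_i=1} x_i * prod_{i: y_i=0} (1 - x_i) over F_p."""
    out = 1
    for yi, xi in zip(y, x):
        out = out * (xi if yi else (1 - xi)) % p
    return out

def encode_eval(a, x, p=P):
    """g_a(x) = sum_y a_y * ind_{m,y}(x), with a indexed by y in lexicographic order."""
    m = len(x)
    cube = product((0, 1), repeat=m)
    return sum(ay * ind(y, x, p) for ay, y in zip(a, cube)) % p
```

The loop over the $M = 2^m$ points of the subcube, with $m$ field multiplications per point, mirrors the $M \cdot \mathrm{poly}(m, \log q)$ count in the proof. On subcube points the encoding returns the corresponding coordinate of `a`, and off the subcube it interpolates multilinearly.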


Finally, for any $H \subseteq \mathbb{F}_q$ we define the decoding map $\mathrm{Dec}_H(\cdot)$ which takes as input a polynomial
$g : \mathbb{F}_q^m \to \mathbb{F}_q$ and returns the following vector $a \in \mathbb{F}_q^M$: for all $y \in \{0,1\}^m$, if $g(y) \in H$, then set
$a_y = g(y)$, and otherwise set $a_y = 0$. Note that for all $a \in H^M$ we have $\mathrm{Dec}_H(g_a) = a$ where $g_a$ is the
low-degree encoding of $a$. For notational brevity we often write $\mathrm{Dec}(\cdot)$ to denote the boolean decoding map
$\mathrm{Dec}_{\{0,1\}}(\cdot)$.


3.5 Linear spaces and registers


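For a set $V$, we write $\mathbb{C}^V$ for the complex vector space of dimension $|V|$. The space $\mathbb{C}^V$ is endowed with a
canonical orthonormal basis $\{|x\rangle\}_{x \in V}$. By “a quantum state on $V$” we mean a unit vector

$$|\psi\rangle_V \in \mathbb{C}^V.$$

If $V = \bigoplus_{i=1}^k V_i$ is the direct sum of subspaces $V_i$ over $\mathbb{F}$, then $\mathbb{C}^V$ can be identified with $\bigotimes_{i=1}^k \mathbb{C}^{V_i}$, by
the identification of $|x_1 \oplus \cdots \oplus x_k\rangle$ with $|x_1\rangle \otimes \cdots \otimes |x_k\rangle$. As a special case, if $\{e_i\}$ is a basis of $V$ the
decomposition $V = \bigoplus_{i=1}^k (\mathbb{F} e_i)$ yields the tensor product decomposition $\mathbb{C}^V = \bigotimes_{i=1}^k \mathbb{C}^{|\mathbb{F}|}$. We sometimes
refer to the spaces $\mathbb{C}^{|\mathbb{F}|}$ as the “qudits” of $\mathbb{C}^V$ (or of a state on it).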


22 It is also known as the generalized Reed-Muller code in the coding theory literature.




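Definition 3.22. For a linear space $V$ over a finite field $\mathbb{F}$, define the EPR state on $\mathbb{C}^V \otimes \mathbb{C}^V$ by

$$|\mathrm{EPR}\rangle_V = \frac{1}{\sqrt{|V|}} \sum_{v \in V} |v\rangle \otimes |v\rangle.$$

We also write $|\mathrm{EPR}\rangle_{\mathbb{F}_q}$ as $|\mathrm{EPR}_q\rangle$ and $|\mathrm{EPR}_2\rangle$ as $|\mathrm{EPR}\rangle$.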


3.6 Measurements and observables 


Quantum measurements are modeled as positive operator-valued measures (POVMs). A POVM consists
of a set of positive semidefinite operators $\{M_a\}_{a \in S}$ indexed by outcomes $a \in S$ that satisfy the condition
$\sum_a M_a = I$. If the latter condition is relaxed to $\sum_a M_a \leq I$ then we refer to $\{M_a\}$ as a sub-measurement. We
sometimes use the same letter $M$ to refer to the collection of operators defining the POVM. The probability
that the measurement returns outcome $a$ on state $|\psi\rangle$ is given by

$$\Pr(a) = \langle\psi| M_a |\psi\rangle.$$

A POVM $M = \{M_a\}$ is said to be projective if each operator $M_a$ is a projector ($M_a^2 = M_a$). This automatically
implies that operators $M_a, M_b$ for distinct outcomes $a \neq b$ are orthogonal ($M_a M_b = M_b M_a = 0$).
An observable is a unitary matrix. A binary observable is an observable $O$ such that $O^2 = I$, i.e. $O$ has
eigenvalues in $\{-1, 1\}$.

We follow the convention that subscripts of the measurement index the outcome and superscripts of
the measurement are used to index different measurements. For example, we use $\{M^x_{a,b}\}$ to represent a
measurement indexed by $x$ whose outcome consists of two parts $a$ and $b$. In this case, by slightly abusing
the notation, we use $\{M^x_a\}$ and $\{M^x_b\}$ to denote

$$M^x_a = \sum_b M^x_{a,b}, \qquad M^x_b = \sum_a M^x_{a,b}.$$

For any $x$, $\{M^x_a\}$ and $\{M^x_b\}$ are POVMs sometimes referred to as the “marginals” of $\{M^x_{a,b}\}$.


Definition 3.23. Let $\{M^x_a\}_{a \in A}$ be a family of POVMs indexed by $x \in X$. Let $f : A \to B$ be an arbitrary
function. We write $\{M^x_{[f(\cdot) = b]}\}$ for the POVM derived from $\{M^x_a\}$ by applying the function $f$ before
returning the outcome. More precisely,

$$M^x_{[f(\cdot) = b]} = \sum_{a : f(a) = b} M^x_a.$$

If $b$ is not in the image of $f$, then we define $M^x_{[f(\cdot) = b]}$ to be 0. In cases where the outcome $a$ has a natural
interpretation as a tuple $(a_1, \ldots, a_k)$ we sometimes slightly abuse notation and write $M^x_{[a_i = b]}$ for some $i \in \{1, \ldots, k\}$ to denote

$$\sum_{a : a_i = b} M^x_a.$$


We will make frequent use of Naimark dilation. In our setting, it can be formulated as follows. For a
proof, see [JNV+20, Theorem 5.1].


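Theorem 3.24 (Naimark dilation). Let $|\psi\rangle$ be a state in $\mathcal{H}_A \otimes \mathcal{H}_B$. Let $A = \{A^x_a\}$ be a sub-measurement
acting on $\mathcal{H}_A$ and $B = \{B^y_b\}$ be a sub-measurement acting on $\mathcal{H}_B$. Then there exists

1. Hilbert spaces $\mathcal{H}_{A_{\mathrm{aux}}}$ and $\mathcal{H}_{B_{\mathrm{aux}}}$,

2. a state $|\mathrm{aux}\rangle \in \mathcal{H}_{A_{\mathrm{aux}}} \otimes \mathcal{H}_{B_{\mathrm{aux}}}$,

3. and two measurements $\hat{A} = \{\hat{A}^x_a\}$ and $\hat{B} = \{\hat{B}^y_b\}$ acting on $\mathcal{H}_A \otimes \mathcal{H}_{A_{\mathrm{aux}}}$ and $\mathcal{H}_B \otimes \mathcal{H}_{B_{\mathrm{aux}}}$, respectively,

such that the following is true. If we write $|\hat{\psi}\rangle = |\psi\rangle \otimes |\mathrm{aux}\rangle$, then for all $x, y, a, b$,

$$\langle\psi| A^x_a \otimes B^y_b |\psi\rangle = \langle\hat{\psi}| \hat{A}^x_a \otimes \hat{B}^y_b |\hat{\psi}\rangle.$$

In addition, $|\mathrm{aux}\rangle$ is a product state, meaning that we can write it as

$$|\mathrm{aux}\rangle = |\mathrm{aux}_A\rangle \otimes |\mathrm{aux}_B\rangle,$$

for $|\mathrm{aux}_A\rangle$ in $\mathcal{H}_{A_{\mathrm{aux}}}$ and $|\mathrm{aux}_B\rangle$ in $\mathcal{H}_{B_{\mathrm{aux}}}$.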

The second of these is the “orthogonalization lemma” from [KV11]. In the setting of symmetric strategies,
it states the following.


3.7 Generalized Pauli observables


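For a prime number $p$, the generalized Pauli operators over $\mathbb{F}_p$ are a collection of observables indexed by a
basis setting $X$ or $Z$ and an element $a$ or $b$ of $\mathbb{F}_p$, with eigenvalues that are $p$-th roots of unity. They are
given by

$$\sigma^X(a) = \sum_{j \in \mathbb{F}_p} |j + a\rangle\langle j| \quad\text{and}\quad \sigma^Z(b) = \sum_{j \in \mathbb{F}_p} \omega^{bj} |j\rangle\langle j|, \tag{14}$$

where $\omega = e^{2\pi i/p}$, and addition and multiplication are over $\mathbb{F}_p$. These observables obey the “twisted commutation”
relations

$$\forall a, b \in \mathbb{F}_p, \quad \sigma^X(a)\, \sigma^Z(b) = \omega^{-ab}\, \sigma^Z(b)\, \sigma^X(a). \tag{15}$$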
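
The relation (15) follows by applying both sides to a basis state $|j\rangle$, and it can also be confirmed numerically. The sketch below builds the $p \times p$ matrices of (14) as nested Python lists (no external libraries); the helper names are illustrative only.

```python
import cmath

def sigma_x(a, p):
    """sigma^X(a) = sum_j |j+a><j| as a p x p matrix (row index = output state)."""
    return [[1 if r == (c + a) % p else 0 for c in range(p)] for r in range(p)]

def sigma_z(b, p):
    """sigma^Z(b) = sum_j omega^{bj} |j><j| with omega = exp(2 pi i / p)."""
    w = cmath.exp(2j * cmath.pi / p)
    return [[w ** (b * r % p) if r == c else 0 for c in range(p)] for r in range(p)]

def matmul(A, B):
    """Plain matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * x for x in row] for row in A]

def close(A, B, tol=1e-9):
    """Entrywise comparison up to floating-point error."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))
```

For example, with `p = 5` one can compare `matmul(sigma_x(a, p), sigma_z(b, p))` against `scale(w ** (-(a * b)), matmul(sigma_z(b, p), sigma_x(a, p)))` for all `a, b`, where `w` is the primitive fifth root of unity.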


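Similarly, over a field $\mathbb{F}_q$ with $q$ a power of $p$, we can consider a set of generalized Pauli operators, indexed
by a basis setting $X$ or $Z$ and an element of $\mathbb{F}_q$ and with eigenvalues that are $p$-th roots of unity. For
$a, b \in \mathbb{F}_q$ they are given by

$$\tau^X(a) = \sum_{j \in \mathbb{F}_q} |j + a\rangle\langle j| \quad\text{and}\quad \tau^Z(b) = \sum_{j \in \mathbb{F}_q} \omega^{\mathrm{tr}(bj)} |j\rangle\langle j|,$$

where addition and multiplication are over $\mathbb{F}_q$. For all $W \in \{X, Z\}$, $a, a' \in \mathbb{F}_q$, and $b \in \mathbb{F}_p$, powers of
these observables obey the relations

$$\tau^W(a)\, \tau^W(a') = \tau^W(a + a') \quad\text{and}\quad (\tau^W(a))^b = \tau^W(ab).$$

In particular, since $pa = 0$ for any $a \in \mathbb{F}_q$ we get that $(\tau^W(a))^p = I$ for any $a \in \mathbb{F}_q$. The observables
obey analogous “twisted commutation” relations to (15),

$$\forall a, b \in \mathbb{F}_q, \quad \tau^X(a)\, \tau^Z(b) = \omega^{-\mathrm{tr}(ab)}\, \tau^Z(b)\, \tau^X(a). \tag{16}$$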


It is clear from the definition that all of the $\tau^X$ operators commute with each other, and similarly all the
$\tau^Z$ operators commute with each other. Thus, it is meaningful to speak of a common eigenbasis for all $\tau^X$
operators, and a common eigenbasis for all $\tau^Z$ operators. The common eigenbasis for the $\tau^Z$ operators is
the computational basis $\{|j\rangle\}_{j \in \mathbb{F}_q}$. To map this basis to the common eigenbasis of the $\tau^X$ operators, one can
apply the Fourier transform

$$F = \frac{1}{\sqrt{q}} \sum_{a, b \in \mathbb{F}_q} \omega^{-\mathrm{tr}(ab)} |a\rangle\langle b|. \tag{17}$$

Explicitly, the eigenbases consist of the vectors $|e_W\rangle$ labeled by an element $e \in \mathbb{F}_q$ and $W \in \{X, Z\}$, given
by

$$|e_X\rangle = \frac{1}{\sqrt{q}} \sum_{j \in \mathbb{F}_q} \omega^{-\mathrm{tr}(ej)} |j\rangle, \qquad |e_Z\rangle = |e\rangle.$$

We denote the POVM whose elements are projectors onto basis vectors of the eigenbasis associated with the
observables $\tau^W$ by $\{\tau^W_e\}_e$. Then for all $W \in \{X, Z\}$ and $a \in \mathbb{F}_q$, the observables $\tau^W(a)$ can be written as

$$\tau^W(a) = \sum_{b \in \mathbb{F}_q} \omega^{\mathrm{tr}(ab)}\, \tau^W_b. \tag{18}$$

To invert this relation, we first quote the following useful Fourier fact.


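Lemma 3.25 (Fact 3.2 of [NW19]). Let $V$ be a subspace of $\mathbb{F}_q^n$. For all $v \notin V^\perp$,

$$\mathop{\mathbb{E}}_{u \sim V} \omega^{\mathrm{tr}(u \cdot v)} = 0,$$

where the expectation is over a uniformly random vector $u$ from $V$.


We may now invert Equation (18):

$$\mathop{\mathbb{E}}_{a \in \mathbb{F}_q} \omega^{-\mathrm{tr}(ab)}\, \tau^W(a) = \sum_{b' \in \mathbb{F}_q} \mathop{\mathbb{E}}_{a \in \mathbb{F}_q} \omega^{\mathrm{tr}(a(b' - b))}\, \tau^W_{b'} = \tau^W_b, \tag{19}$$

where the second step follows from Lemma 3.25.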

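For systems with many qudits, we will consider tensor products of the operators $\tau^W$. Slightly abusing
notation, for $W \in \{X, Z\}$ and $a \in \mathbb{F}_q^n$ we denote by $\tau^W(a)$ the tensor product $\tau^W(a_1) \otimes \cdots \otimes \tau^W(a_n)$.
These obey the twisted commutation relations

$$\forall a, b \in \mathbb{F}_q^n, \quad \tau^X(a)\, \tau^Z(b) = \omega^{-\mathrm{tr}(a \cdot b)}\, \tau^Z(b)\, \tau^X(a),$$

where $a \cdot b = \sum_{i=1}^n a_i b_i \in \mathbb{F}_q$. For $W \in \{X, Z\}$ and $e \in \mathbb{F}_q^n$ define the eigenstates

$$|e_W\rangle = |(e_1)_W\rangle \otimes \cdots \otimes |(e_n)_W\rangle,$$

and associated rank-1 projectors $\tau^W_e$. Analogous versions of Equation (18) and Equation (19) relate the
multi-qudit observables and projectors:

$$\tau^W(a) = \sum_{b \in \mathbb{F}_q^n} \omega^{\mathrm{tr}(a \cdot b)}\, \tau^W_b, \tag{20}$$

$$\mathop{\mathbb{E}}_{a \in \mathbb{F}_q^n} \omega^{-\mathrm{tr}(a \cdot b)}\, \tau^W(a) = \sum_{b' \in \mathbb{F}_q^n} \mathop{\mathbb{E}}_{a \in \mathbb{F}_q^n} \omega^{\mathrm{tr}(a \cdot (b' - b))}\, \tau^W_{b'} = \tau^W_b. \tag{21}$$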


Since we only consider finite fields $\mathbb{F}_q$ such that $q = 2^k$, the maximally entangled state $|\mathrm{EPR}_q\rangle$ and the
corresponding qudit Pauli observables/projectors are isomorphic to a tensor product of maximally entangled
states $|\mathrm{EPR}_2\rangle$ and qubit Pauli observables/projectors respectively; this is shown in the next lemma. This is
used to argue that the Pauli basis test (described in Section 7.3) gives a self-test for Pauli observables and
maximally entangled states over qubits.




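Lemma 3.26. For all admissible field sizes $q = 2^k$ and integers $L$, there exists an isomorphism $\phi : (\mathbb{C}^q)^{\otimes L} \to (\mathbb{C}^2)^{\otimes Lk}$ such that$^{23}$

$$\phi \otimes \phi\, |\mathrm{EPR}_q\rangle^{\otimes L} = |\mathrm{EPR}_2\rangle^{\otimes Lk}. \tag{22}$$

Moreover, for all $W \in \{X, Z\}$ and for all $u \in \mathbb{F}_q^L$,

$$\tau^W_u = \phi^\dagger \Big( \bigotimes_{i=1}^L \bigotimes_{j=1}^k \sigma^W_{u_{ij}} \Big) \phi. \tag{23}$$

Here, $(u_{ij})_{i,j}$ denotes a vector of $\mathbb{F}_2$ values such that $u_i = \sum_j u_{ij} e_j$ for all $i \in \{1, \ldots, L\}$, with
$\{e_1, \ldots, e_k\}$ being the self-dual normal basis of $\mathbb{F}_q$ over $\mathbb{F}_2$ specified by Lemma 3.16. For $i \in \{1, \ldots, L\}$
and $j \in \{1, \ldots, k\}$ the $(i, j)$-th factor $\sigma^W_{u_{ij}}$ denotes the projector $\frac{1}{2}(I + (-1)^{u_{ij}} \sigma^W(1))$ acting on the $s$-th
qubit of $|\mathrm{EPR}_2\rangle^{\otimes Lk}$, where $s = (i - 1)k + j$.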
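Proof. Define the isometry $\theta : \mathbb{C}^q \to (\mathbb{C}^2)^{\otimes k}$ as $\theta |a\rangle = |a_1\rangle \otimes |a_2\rangle \otimes \cdots \otimes |a_k\rangle$ where $\kappa(a) = (a_1, a_2, \ldots, a_k) \in \mathbb{F}_2^k$
is the bijection introduced in Section 3.3 corresponding to the basis $\{e_1, \ldots, e_k\}$. Observe that $\theta \otimes \theta\, |\mathrm{EPR}_q\rangle = |\mathrm{EPR}_2\rangle^{\otimes k}$.

Let $a \in \mathbb{F}_q$, and let $\kappa(a) = (a_1, \ldots, a_k) \in \mathbb{F}_2^k$. Then from (21), using $\omega = -1$ and $(-1)^{-c} = (-1)^c$, we get

$$\tau^W_a = \mathop{\mathbb{E}}_{b \in \mathbb{F}_q} (-1)^{-\mathrm{tr}(ab)}\, \tau^W(b) = \mathop{\mathbb{E}}_{b \in \mathbb{F}_q} (-1)^{\sum_j a_j b_j}\, \tau^W\Big(\sum_j b_j e_j\Big), \tag{24}$$

where $b = \sum_j b_j e_j$; since the basis $\{e_j\}$ is self-dual, we have that $b_j = \mathrm{tr}(b e_j)$. From (24) and the product
relation in (3.7) we get that

$$\tau^W_a = \mathop{\mathbb{E}}_{b_1, \ldots, b_k \in \mathbb{F}_2} \prod_{j=1}^k (-1)^{a_j b_j}\, \tau^W(b_j e_j)$$
$$\phantom{\tau^W_a} = \prod_{j=1}^k \mathop{\mathbb{E}}_{b_j \in \mathbb{F}_2} (-1)^{a_j b_j}\, \tau^W(b_j e_j), \tag{25}$$

where we have used the fact that the expectation is over iid $b_1, \ldots, b_k$ to take it inside the product in
the second line. Next we claim that for all $c \in \mathbb{F}_2$, we have $\tau^W(c e_j) = \theta^\dagger \sigma^{W,j}(c)\, \theta$ where $\sigma^{W,j}(c) = I$ if
$c = 0$, and otherwise is the Pauli $W$ observable acting on the $j$-th qubit of $(\mathbb{C}^2)^{\otimes k}$. This can be verified by
comparing the actions of both operators on the basis states of $\mathbb{C}^q$.

Thus we obtain that the right-hand side of (25) is equal to

$$\prod_{j=1}^k \mathop{\mathbb{E}}_{b_j \in \mathbb{F}_2} (-1)^{a_j b_j}\, \theta^\dagger \sigma^{W,j}(b_j)\, \theta = \theta^\dagger \Big( \bigotimes_{j=1}^k \mathop{\mathbb{E}}_{b_j \in \mathbb{F}_2} (-1)^{a_j b_j}\, \sigma^W(b_j) \Big) \theta \tag{26}$$
$$= \theta^\dagger \Big( \bigotimes_{j=1}^k \sigma^W_{a_j} \Big) \theta. \tag{27}$$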

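23 Here we have applied the natural identifications between $(\mathbb{C}^q)^{\otimes L} \otimes (\mathbb{C}^q)^{\otimes L}$ and $(\mathbb{C}^q \otimes \mathbb{C}^q)^{\otimes L}$, and between $(\mathbb{C}^2)^{\otimes Lk} \otimes (\mathbb{C}^2)^{\otimes Lk}$ and $(\mathbb{C}^2 \otimes \mathbb{C}^2)^{\otimes Lk}$.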




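Define $\phi = \theta^{\otimes L}$. It is evident that (22) holds for this choice of $\phi$, and it remains to show (23). The projector
$\tau^W_u$ can be decomposed as the tensor product $\bigotimes_{i=1}^L \tau^W_{u_i}$ where $\tau^W_{u_i}$ acts on the $i$-th factor of $(\mathbb{C}^q)^{\otimes L}$. Express
each $u_i$ as $\sum_j u_{ij} e_j$ where $u_{ij} \in \mathbb{F}_2$. Then from Equation (27) we get that

$$\tau^W_u = \bigotimes_{i=1}^L \tau^W_{u_i} = \bigotimes_{i=1}^L \theta^\dagger \Big( \bigotimes_{j=1}^k \sigma^W_{u_{ij}} \Big) \theta = \phi^\dagger \Big( \bigotimes_{i=1}^L \bigotimes_{j=1}^k \sigma^W_{u_{ij}} \Big) \phi, \tag{28}$$

which establishes Equation (23). □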




4 Conditionally Linear Functions, Distributions, and Samplers 


4.1 Conditionally linear functions and distributions 


We first introduce conditionally linear functions, which are used to specify the question distribution for
games considered in the paper in a way that the question distribution can be “introspected”, as described
in Section 8. Intuitively, a conditionally linear function takes as input an element $x \in V = \mathbb{F}^n$ for some
$n \geq 0$, and applies linear maps $L_j$ sequentially on $x^{V_j}$ where $V_1, V_2, \ldots$ are a sequence of complementary
register subspaces such that both the linear maps $L_j$ and the subspaces $V_j$ for $j \geq 2$ may depend on the values
taken by previous linear maps $L_1(x^{V_1})$, $L_2(x^{V_2})$, etc.

In the remainder of the section we use $V$ to denote the linear space $\mathbb{F}^n$ for some integer $n \geq 0$. For
ease of notation we extensively use the subscript range notation. For example, if $V_1, V_2, \ldots, V_\ell$ are fixed
subspaces of $V$ and $k \in \{1, 2, \ldots, \ell\}$ we write

$$V_{<k} = \bigoplus_{j :\, 1 \leq j < k} V_j, \qquad V_{>k} = \bigoplus_{j :\, \ell \geq j > k} V_j,$$

and it is understood that $V_{\leq k}$ and $V_{\geq k}$ are identical to $V_{<k+1}$ and $V_{>k-1}$, respectively. Moreover, if $V'$ is a
register subspace of $V$, $F : V' \to V'$ is a function, and $x \in V$, we write $x^F$ to denote $F(x^{V'})$. For example,
in the following definition $x^{L_1}$ is used as shorthand notation for $L_1(x^{V_1})$.


Definition 4.1. Let $V$ be $\mathbb{F}^n$ for some $n \geq 0$. For all integers $\ell \geq 0$ the collection of $\ell$-level conditionally
linear functions (implicitly, on $V$) is defined inductively as follows.


1. There is a single 0-level conditionally linear function, which is the 0 function on $V$.


2. Let $\ell \geq 1$ and suppose the collection of $(\ell-1)$-level conditionally linear functions has been defined.
The collection of $\ell$-level conditionally linear functions on $V$ consists of all functions $L$ on $V$ that can
be expressed in the following form. There exist complementary register subspaces $V_1$ and $V_{>1}$ of $V$, a
linear function $L_1$ on $V_1$, and for all $v \in L_1(V_1)$, an $(\ell-1)$-level conditionally linear function $L_{>1,v}$
on $V_{>1}$, such that for all $x \in V$,

$$L(x) = x^{L_1} + L_{>1,\,x^{L_1}}\big(x^{V_{>1}}\big) .$$


The concept of conditionally linear functions is simple even though the notation looks complicated. It
is best understood using the operational definition illustrated in Fig. 1. We give an example of a 2-level CL
function in Example 4.3.


Remark 4.2. It follows from the above definition that all 1-level CL functions are linear. 


Example 4.3. Consider $V = \mathbb{F}_2^3$. Then the following map

$$L : (x_0, x_1, x_2) \mapsto (0,\; x_0 x_2 + x_1 x_2 + x_0,\; x_2)$$

on $V$ is a 2-level CL function by choosing $V_1 = \mathrm{span}\{(0,0,1)\}$, $L_1$ the identity function on $V_1$, $V_2 = V_1^{\perp}$
and

$$L_{2,x_2} : (x_0, x_1, 0) \mapsto (0,\; x_0 x_2 + x_1 x_2 + x_0,\; 0),$$

which is linear for all $x_2 \in \mathbb{F}_2$. Note that intuitively the function $L$ selects either $x_0$ or $x_1$ in the second
coordinate conditioning on the value of $x_2$.
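The claims of Example 4.3 can be checked by brute force over the eight points of $\mathbb{F}_2^3$. The following is a minimal sketch in Python; the helper names `add`, `L2`, and the boolean flags are illustrative choices and do not come from the paper.

```python
from itertools import product

def L(x):
    """The map of Example 4.3 on F_2^3 (all arithmetic is mod 2)."""
    x0, x1, x2 = x
    return (0, (x0 * x2 + x1 * x2 + x0) % 2, x2)

def add(u, v):
    """Vector addition in F_2^3."""
    return tuple((a + b) % 2 for a, b in zip(u, v))

def L2(x2, y):
    """Second-level map L_{2,x2} on V_2 = span{e_0, e_1}, for a fixed value of x2."""
    y0, y1, _ = y
    return (0, (y0 * x2 + y1 * x2 + y0) % 2, 0)

V = list(product((0, 1), repeat=3))
V2 = [v for v in V if v[2] == 0]

# For each fixed x2, L_{2,x2} is additive (hence linear over F_2) on V_2.
second_level_linear = all(
    L2(x2, add(u, v)) == add(L2(x2, u), L2(x2, v))
    for x2 in (0, 1) for u in V2 for v in V2
)

# L itself is not additive on V, so it is conditionally linear but not linear.
L_is_linear = all(L(add(u, v)) == add(L(u), L(v)) for u in V for v in V)

# The second output coordinate is x0 when x2 = 0 and x1 when x2 = 1.
selects = all(L(x)[1] == (x[1] if x[2] else x[0]) for x in V)

# L(x) = x^{L_1} + L_{2, x^{L_1}}(x^{V_2}), with L_1 the identity on V_1 = span{e_2}.
decomposes = all(
    L(x) == add((0, 0, x[2]), L2(x[2], (x[0], x[1], 0))) for x in V
)
```

The check that `L_is_linear` is false is consistent with Remark 4.2: the example needs two levels.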




Figure 1: An illustration of an $\ell$-level CL function $L$. First a register subspace $V_1$ and a linear map $L_1$ are
chosen and applied to obtain $x^{L_1} = L_1(x^{V_1}) \in V_1$. Then, depending on the value of $x^{L_1}$, a register subspace $V_2$
and a linear map $L_{2,x^{L_1}}$ are chosen and applied to obtain $x^{L_2} = L_{2,x^{L_1}}(x^{V_2}) \in V_2$, and so on. Finally, $L(x)$
is defined to be $\sum_{j=1}^{\ell} x^{L_j}$.


Remark 4.4. Note that for any integer $\ell \geq 1$ the collection of $\ell$-level CL functions trivially contains the
collection of $(\ell-1)$-level CL functions: for this it suffices to note that the 0 function, which is a 0-level
CL function, is also a 1-level CL function by setting $V_1 = V$, $V_{>1} = \{0\}$, $L_1(x) = 0$ for all $x \in V$, and
$L_{>1,\,x^{L_1}}$ the 0 map for all $x \in V$.


Definition 4.5. Let $L, R : V \to V$ be conditionally linear functions. The conditionally linear distribution
$\mu_{L,R}$ corresponding to $(L, R)$ is defined as the distribution over pairs $(L(x), R(x)) \in V \times V$ for $x$ drawn
uniformly at random from $V$.
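For small spaces the distribution of Definition 4.5 can be tabulated exactly by enumerating $V$. The sketch below does this over $\mathbb{F}_2^3$; the pair $(L, R)$ used here (the map of Example 4.3 together with the identity) is an illustrative assumption, not a pair used in the paper, and `cl_distribution` is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cl_distribution(L, R, n, p=2):
    """Exact law of (L(x), R(x)) for x drawn uniformly from F_p^n."""
    counts = Counter((L(x), R(x)) for x in product(range(p), repeat=n))
    total = p ** n
    return {pair: Fraction(c, total) for pair, c in counts.items()}

# Illustrative choice: L is the 2-level CL function of Example 4.3,
# R is the identity map (a 1-level CL function).
def L(x):
    x0, x1, x2 = x
    return (0, (x0 * x2 + x1 * x2 + x0) % 2, x2)

def R(x):
    return x

mu = cl_distribution(L, R, n=3)
```

Because `R` is injective here, every pair in the support has probability 1/8; a non-injective `R` would merge outcomes.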


Throughout the paper we abbreviate “conditionally linear functions” and “conditionally linear distribu- 
tions” as CL functions and CL distributions, respectively. 

The following lemma elucidates structural properties of $\ell$-level CL functions. Recall that using our
shorthand notation, $x^{L_{<k}}$ and $x^{L_k}$ in the lemma denote $L_{<k}(x)$ and $L_{k,u}(x^{V_{k,u}})$ where $u = L_{<k}(x)$.


Lemma 4.6. Let $\ell \geq 1$ and $V = \mathbb{F}^n$ for some integer $n \geq 0$. A function $L : V \to V$ is an $\ell$-level CL
function if and only if the following collection of functions and subspaces exists:


(i) For each $k \in \{1,2,\ldots,\ell\}$, a function $L_{\leq k} : V \to V$ called the $k$-th marginal of $L$;


(ii) For each $k \in \{1,2,\ldots,\ell\}$ and $u \in L_{<k}(V)$, a register subspace $V_{k,u}$ of $V$ called the $k$-th factor
space with prefix $u$;


(iii) For each $k \in \{1,2,\ldots,\ell\}$ and $u \in L_{<k}(V)$, a linear map $L_{k,u} : V_{k,u} \to V_{k,u}$ called the $k$-th linear
map of $L$ with prefix $u$;


such that the following conditions hold for all $k \in \{1,2,\ldots,\ell\}$.


1. $L_{\leq k}$ is a $k$-level CL function on $V$;




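2. $V = \bigoplus_{i=1}^{\ell} V_{i,\,x^{L_{<i}}}$ for all $x \in V$;


3. $L_{\leq k}(x) = \sum_{i=1}^{k} x^{L_i}$ for all $x \in V$, where $L_i$ is shorthand notation for $L_{i,\,x^{L_{<i}}}$;


4. $L = L_{\leq \ell}$.

As in Item 3, we sometimes use $V_k$ and $L_k$ to denote $V_{k,u}$ and $L_{k,u}$, respectively, leaving the prefix $u$ implicit.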


Proof. We first prove the “if” direction: if there exist spaces and functions satisfying the conditions in the
lemma, the fact that $L$ is an $\ell$-level CL function follows from Items 1 and 4 of the lemma statement.

We now prove the “only if” direction. Given a CL function $L$ on $V$, we construct the $k$-th family of
subspaces and functions for all $k \in \{1,\ldots,\ell\}$ by induction on the level $\ell$. First consider the base case
$\ell = 1$. Since $L_{<1} = 0$, we omit mention of the prefix $u \in L_{<1}(V)$. Define $L_{\leq 1} = L$, the factor
space $V_1 = V$, and the linear map $L_1 = L$. It is straightforward to verify that the conditions in the lemma
hold for these choices of linear maps and spaces.

Now, assume that the lemma holds for CL functions of level at most $\ell - 1$, and we prove the lemma for
$\ell$-level CL functions. By definition, an $\ell$-level CL function $L$ can be written as

$$L(x) = x^{L_1} + L_{>1,\,x^{L_1}}\big(x^{V_{>1}}\big)$$

for some linear map $L_1 : V_1 \to V_1$ and a family of $(\ell-1)$-level CL functions

$$\{L_{>1,v} : V_{>1} \to V_{>1}\}_{v \in L_1(V_1)} ,$$

where $V_1$ and $V_{>1}$ are complementary register subspaces of $V$. Next, using the inductive hypothesis on the
$(\ell-1)$-level CL function $L_{>1,v}$, we get that for all $v \in L_1(V_1)$ and all $k \in \{1,2,\ldots,\ell-1\}$ there exist $k$-th
marginal functions $L'_{v,\leq k} : V_{>1} \to V_{>1}$, $k$-th factor spaces $V'_{v,k,u}$ and $k$-th linear maps $L'_{v,k,u}$ of $L_{>1,v}$ with
prefix $u \in L'_{v,<k}(V_{>1})$ such that the conditions of the lemma for $L_{>1,v}$ hold.

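Define the marginal functions $L_{\leq k} : V \to V$, factor spaces $V_{k,u}$, and linear maps $L_{k,u}$ for $L$ as follows.


(i) Define $L_{\leq 1} = L_1$ and the first factor space to be $V_1$;
(ii) For all $k \in \{2,3,\ldots,\ell\}$, define
$$L_{\leq k} : x \mapsto x^{L_1} + L'_{x^{L_1},\,\leq k-1}\big(x^{V_{>1}}\big) \quad \text{for } x \in V; \tag{29}$$
(iii) For all $k \in \{2,3,\ldots,\ell\}$ and $u \in L_{<k}(V)$, define $V_{k,u} = V'_{v,k-1,w}$ and $L_{k,u} = L'_{v,k-1,w}$ where
$v = u^{V_1} \in L_1(V_1)$ and $w = u^{V_{>1}} \in L'_{v,<k-1}(V_{>1})$.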


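We verify that the conditions of the lemma are satisfied. Since $L'_{v,\leq k-1}$ is by assumption a $(k-1)$-level CL
function on $V_{>1}$, we get that $L_{\leq k}$ is a $k$-level CL function on $V$ from Eq. (29), establishing Item 1. By the
induction hypothesis, we have for all $v \in L_1(V_1)$ and $y \in V_{>1}$,

$$V_{>1} = \bigoplus_{j=1}^{\ell-1} V'_{v,\,j,\,L'_{v,<j}(y)} , \tag{30}$$

which implies that for all $x \in V$ and $v = x^{L_1}$,

$$V = V_1 \oplus \Big( \bigoplus_{j=1}^{\ell-1} V'_{v,\,j,\,L'_{v,<j}(x^{V_{>1}})} \Big) = V_1 \oplus \Big( \bigoplus_{j=2}^{\ell} V_{j,\,x^{L_{<j}}} \Big) = \bigoplus_{j=1}^{\ell} V_{j,\,x^{L_{<j}}} .$$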




The first equality follows from Eq. (30) while the second and third equalities follow from the definition of
$V_{k,u}$. This establishes Item 2.
Next, we have that for all $x \in V$, $v = x^{L_1}$, and $k \in \{1,2,\ldots,\ell\}$,

$$L_{\leq k}(x) = v + L'_{v,\leq k-1}\big(x^{V_{>1}}\big) \tag{31}$$
$$= v + \sum_{i=1}^{k-1} x^{L'_i} = \sum_{i=1}^{k} x^{L_i} , \tag{32}$$

where $L'_i$ is the $i$-th linear map of $L_{>1,v}$ with prefix $L'_{v,<i}(x^{V_{>1}})$ and $L_i$ is the $i$-th linear map of $L$ with prefix
$L_{<i}(x)$. The second equality follows from the inductive hypothesis applied to $L'_{v,\leq k-1}$, and the third equality
follows from the definition of $L_i$. Line (32) implies Item 3 of the lemma.
Finally, Item 4 follows from (31) where we set $k = \ell$ and observe that $L'_{v,\leq \ell-1}$ is equal to $L_{>1,v}$
by the inductive hypothesis. This shows that $L_{\leq k}$, $V_{k,u}$, and $L_{k,u}$ satisfy the conditions of the lemma and
completes the induction. □


We note that the marginal functions, factor spaces, and linear maps of a given CL function $L$ may not be
unique; for example, consider the identity function on a linear space $V = \mathbb{F}^n$. This is clearly a 1-level CL
function, but it can also be viewed as a $k$-level CL function for $k \in \{2,\ldots,n\}$ with an arbitrary partition of
$V$ into factor spaces.


Lemma 4.7. Let $\ell, k \geq 0$ be integers and $U = \mathbb{F}^m$, $V = \mathbb{F}^n$ be linear spaces. Suppose $L$ is a $k$-level CL
function on $U$ and $R_u$ is an $\ell$-level CL function on $V$ for each $u \in L(U)$. Then the concatenation $T$ of $L$
and $\{R_u\}$ defined as

$$T(x) = L(x^U) + R_{L(x^U)}(x^V)$$

is a $(k+\ell)$-level conditionally linear function on $U \oplus V$.


Proof. We prove the claim by induction on $k$. The case $k = 0$ follows from Definition 4.1. Assume
that the lemma holds for $L$ being at most $(k-1)$-level. By Definition 4.1, there are complementary register
subspaces $U_1$ and $U_{>1}$ of $U$, a linear function $L_1$ on $U_1$, and a family of $(k-1)$-level CL functions $L_{>1,v}$
for $v \in L_1(U_1)$ such that

$$L(x^U) = x^{L_1} + L_{>1,\,x^{L_1}}\big(x^{U_{>1}}\big) .$$

For all $x^{L_1}$, define the function $T_{>1,\,x^{L_1}}$ on $U_{>1} \oplus V$ as

$$T_{>1,\,x^{L_1}}\big(x^{U_{>1} \oplus V}\big) = L_{>1,\,x^{L_1}}\big(x^{U_{>1}}\big) + R_{x^{L_1} + x^{L_{>1}}}\big(x^V\big) ,$$

the concatenation of $L_{>1,\,x^{L_1}}$ and $\{R_{x^{L_1} + x^{L_{>1}}}\}$, where $L_{>1}$ is shorthand notation for $L_{>1,\,x^{L_1}}$. By the
induction hypothesis, $T_{>1,\,x^{L_1}}$ is $(k+\ell-1)$-level conditionally linear. The lemma follows from Definition 4.1. □


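Lemma 4.8 (Direct sums of CL functions). Let $V^{(1)}, V^{(2)}, \ldots, V^{(m)}$ be register subspaces of $V$ such that
$V = \bigoplus_{j=1}^{m} V^{(j)}$. Suppose that, for each $j \in \{1,2,\ldots,m\}$, $L^{(j)}$ is an $\ell_j$-level conditionally linear function
on $V^{(j)}$. Then the direct sum $L = \bigoplus_{j=1}^{m} L^{(j)}$ is an $\ell$-level CL function over $V$ for $\ell = \max_j \{\ell_j\}$, where $L$ is
defined by

$$L(x) = \sum_{j=1}^{m} L^{(j)}\big(x^{(j)}\big)$$

for all $x = \sum_j x^{(j)} \in \bigoplus_j V^{(j)}$.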




Proof. It is easy to see that an $\ell$-level CL function is also $k$-level conditionally linear for all $k \geq \ell$. Hence,
it suffices to prove the claim where $\ell_j = \ell$ for $j = 1,2,\ldots,m$.

We prove the claim by induction on $\ell$. For $\ell = 1$, the functions $L^{(j)}$ are linear and the claim
follows from the fact that the direct sum of linear maps is linear.

Assume now that the claim holds for conditionally linear functions of level at most $\ell - 1$ and that the $L^{(j)}$ are
$\ell$-level conditionally linear functions for $j = 1,2,\ldots,m$. By definition, $L^{(j)}$ is the concatenation of
conditionally linear functions $L^{(j)}_1$ on $V^{(j)}_1$ and $\{L^{(j)}_{>1,v_j}\}_{v_j}$ on $V^{(j)}_{>1}$ of levels 1 and $\ell - 1$ respectively. Furthermore,

$$L(x) = \sum_{j=1}^{m} \Big( v_j + L^{(j)}_{>1,v_j}\big( (x^{(j)})^{V^{(j)}_{>1}} \big) \Big) ,$$

where $v_j = L^{(j)}_1\big( (x^{(j)})^{V^{(j)}_1} \big)$. By the induction hypothesis,

$$L_1(x) = \sum_{j=1}^{m} L^{(j)}_1\big( (x^{(j)})^{V^{(j)}_1} \big) , \qquad L_{>1,v}(x) = \sum_{j=1}^{m} L^{(j)}_{>1,v_j}\big( (x^{(j)})^{V^{(j)}_{>1}} \big)$$

are 1-level and $(\ell-1)$-level conditionally linear respectively for $v = \sum_j v_j$, $V_1 = \bigoplus_j V^{(j)}_1$ and
$V_{>1} = \bigoplus_j V^{(j)}_{>1}$. This proves that $L$ is $\ell$-level conditionally linear. □


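Lemma 4.9. For each $i \in \{1,\ldots,m\}$ let $L^{(i)}, R^{(i)} : V^{(i)} \to V^{(i)}$ be $\ell_i$-level conditionally linear functions
and let $L, R : V \to V$ be the direct sums $L = \bigoplus_i L^{(i)}$ and $R = \bigoplus_i R^{(i)}$, respectively, as defined in Lemma 4.8.
Then the conditionally linear distribution $\mu_{L,R}$ is the product distribution $\prod_{i=1}^{m} \mu_{L^{(i)},R^{(i)}}$ over $V \times V$.


Proof. The distribution $\mu_{L,R}$ is the distribution over pairs $(L(x), R(x))$ where $x$ is sampled uniformly from
$V$. By Lemma 4.8, this is equivalent to the distribution over pairs $\big( (L^{(i)}(x^{V^{(i)}}))_{i=1}^{m}, (R^{(i)}(x^{V^{(i)}}))_{i=1}^{m} \big)$ where $x$
is chosen uniformly at random from $V$. Since the components $x^{V^{(i)}}$ of a uniformly random $x$ are independent and
uniform on $V^{(i)}$, this distribution is exactly the product of the distributions $\mu_{L^{(i)},R^{(i)}}$
for $i = 1,2,\ldots,m$. □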
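Lemma 4.9 can be sanity-checked on a toy instance by comparing the joint law of $(L(x), R(x))$ with the product of the block-wise laws. The sketch below assumes a hypothetical decomposition $\mathbb{F}_2^3 = V^{(1)} \oplus V^{(2)}$ with $V^{(1)}$ the first two coordinates and $V^{(2)}$ the last one; the four block maps are arbitrary linear (1-level CL) examples chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def distribution(L, R, n):
    """Exact law of (L(x), R(x)) for x uniform over F_2^n."""
    counts = Counter((L(x), R(x)) for x in product((0, 1), repeat=n))
    return {k: Fraction(c, 2 ** n) for k, c in counts.items()}

# Hypothetical blocks: V^(1) = F_2^2 (first two coordinates), V^(2) = F_2 (last one).
L1 = lambda x: (x[0], (x[0] + x[1]) % 2)   # linear on V^(1)
R1 = lambda x: (0, x[1])                   # linear on V^(1)
L2 = lambda x: (x[0],)                     # identity on V^(2)
R2 = lambda x: (0,)                        # zero map on V^(2)

# Direct sums act block-wise on x = (x^(1), x^(2)).
L = lambda x: L1(x[:2]) + L2(x[2:])
R = lambda x: R1(x[:2]) + R2(x[2:])

joint = distribution(L, R, 3)
mu1 = distribution(L1, R1, 2)
mu2 = distribution(L2, R2, 1)

# Product distribution, with outcomes concatenated block-wise.
prod = {
    (l1 + l2, r1 + r2): p1 * p2
    for (l1, r1), p1 in mu1.items()
    for (l2, r2), p2 in mu2.items()
}
product_matches = (joint == prod)
```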


CL functions used in the paper are frequently defined over a “large” field $\mathbb{F}_q$ (e.g., the CL functions
used in the low degree tests of Section 7). However, the introspection protocol in Section 8 handles CL
functions defined over $\mathbb{F}_2$. The following definition and lemma show that CL functions over prime power
fields can be viewed as CL functions over the prime field via a “downsizing” operation.

Definition 4.10 (Downsizing CL functions). Let $V = \mathbb{F}_q^n$ be a linear space for a prime power $q = p^t$ for
odd $t$. Let $L : V \to V$ be a function. Let $\kappa(\cdot)$ denote the downsize map from Section 3.3 corresponding
to the basis $\{e_1,\ldots,e_t\}$ of $\mathbb{F}_q$ over $\mathbb{F}_p$ specified by Lemma 3.16. In particular, $\kappa$ is linear over $\mathbb{F}_p$, and by
Lemma 3.14, the set $\kappa(V)$ is the linear space $\mathbb{F}_p^{tn}$. Define the downsized function $L^\kappa : \kappa(V) \to \kappa(V)$ by
$L^\kappa = \kappa \circ L \circ \kappa^{-1}$.

Lemma 4.11. Let $V = \mathbb{F}_q^n$ for a prime power $q = p^t$. Let $L : V \to V$ be an $\ell$-level CL function over $V$
for some integer $\ell \geq 0$. Let $L_{\leq j}$, $V_{j,u}$, and $L_{j,u}$ denote the $j$-th marginal functions, factor spaces, and linear
maps corresponding to $L$ as guaranteed by Lemma 4.6. Then $L^\kappa : \kappa(V) \to \kappa(V)$ is an $\ell$-level CL function
on $V^\kappa = \kappa(V) = \mathbb{F}_p^{tn}$ with marginal functions $L^\kappa_{\leq j}$, factor spaces $V^\kappa_{j,u}$, and linear maps $L^\kappa_{j,u}$ that satisfy the
following for all $j \in \{1,\ldots,\ell\}$.


1. The $j$-th marginal function $L^\kappa_{\leq j}$ of $L^\kappa$ is equal to $\kappa \circ L_{\leq j} \circ \kappa^{-1}$.




2. For all $u \in L_{<j}(V)$, the $j$-th factor space $V^\kappa_{j,\kappa(u)}$ and the $j$-th linear map $L^\kappa_{j,\kappa(u)}$ of $L^\kappa$ are equal to
$\kappa(V_{j,u})$ and $\kappa \circ L_{j,u} \circ \kappa^{-1}$ respectively.


Proof. We prove the lemma by induction on $\ell$. Let $L : V \to V$ be an $\ell$-level CL function. For the base case
$\ell = 1$, observe that since $\kappa$ is a linear bijection between $\mathbb{F}_q^n$ and $\mathbb{F}_p^{tn}$ as linear spaces over $\mathbb{F}_p$, the function
$L^\kappa$ is linear, and thus a 1-level CL function over $\kappa(V) = \mathbb{F}_p^{tn}$. Furthermore, the first marginal function
$L^\kappa_{\leq 1} = L^\kappa = \kappa \circ L_{\leq 1} \circ \kappa^{-1}$; the factor space $V^\kappa_1 = \kappa(V_1) = \kappa(V)$, and $L^\kappa_1 = L^\kappa = \kappa \circ L_1 \circ \kappa^{-1}$.

Assume that the statement of the lemma holds for some $\ell - 1 \geq 1$. Let $L_{\leq j}$, $V_{j,u}$, and $L_{j,u}$ denote
the marginal functions, factor spaces, and linear maps corresponding to $L$ as guaranteed by Lemma 4.6.
Recursively define the following functions and spaces, for $j \in \{1,\ldots,\ell\}$.


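1. $L^\kappa_{\leq j} = \kappa \circ L_{\leq j} \circ \kappa^{-1}$;


2. For all $u \in L_{<j}(V)$, set $V^\kappa_{j,\kappa(u)} = \kappa(V_{j,u})$ and set $L^\kappa_{j,\kappa(u)} = \kappa \circ L_{j,u} \circ \kappa^{-1}$.


We argue that $\{L^\kappa_{\leq j}\}$, $\{V^\kappa_{j,u}\}$, and $\{L^\kappa_{j,u}\}$ satisfy the conditions of Lemma 4.6 for the function $L^\kappa$, which
implies that $L^\kappa$ is an $\ell$-level CL function over $\kappa(V)$.
We first establish Item 4 of Lemma 4.6. Since $L_{\leq \ell} = L$, this implies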


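$$L^\kappa = \kappa \circ L \circ \kappa^{-1} = \kappa \circ L_{\leq \ell} \circ \kappa^{-1} = L^\kappa_{\leq \ell} ,$$

as desired. Next, for all $j \in \{1,2,\ldots,\ell\}$, for all $y \in \kappa(V)$ with $y = \kappa(x)$ for some $x \in V$, letting
$u_j = L_{<j}(x)$, we have

$$\kappa(V) = \kappa\Big( \bigoplus_{j=1}^{\ell} V_{j,u_j} \Big) = \bigoplus_{j=1}^{\ell} \kappa(V_{j,u_j}) = \bigoplus_{j=1}^{\ell} V^\kappa_{j,\kappa(u_j)} .$$

The first equality follows from Item 2 of Lemma 4.6, the second equality follows from Lemma 3.14, and
the third equality follows by definition. Since $\kappa(x^{L_{<j}}) = L^\kappa_{<j} \circ \kappa(x) = L^\kappa_{<j}(y)$, this establishes Item 2 of
Lemma 4.6.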

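Next, we have for all $j \in \{1,2,\ldots,\ell\}$ and all $y \in \kappa(V)$ with $y = \kappa(x)$ for some $x \in V$, writing $u_i = L_{<i}(x)$,

$$L^\kappa_{\leq j}(y) = \kappa \circ L_{\leq j} \circ \kappa^{-1}(y) = \kappa\big( L_{\leq j}(x) \big) = \kappa\Big( \sum_{i=1}^{j} L_{i,u_i}\big( x^{V_{i,u_i}} \big) \Big) = \sum_{i=1}^{j} L^\kappa_{i,v_i}\big( \kappa( x^{V_{i,u_i}} ) \big) = \sum_{i=1}^{j} L^\kappa_{i,v_i}\big( y^{V^\kappa_{i,v_i}} \big) ,$$

where $v_i = L^\kappa_{<i}(y) = \kappa(x^{L_{<i}})$. The first equality follows from the definition of $L^\kappa_{\leq j}$, the second equality follows
from $y = \kappa(x)$, the third equality follows from Item 3 of Lemma 4.6 applied to $L_{\leq j}$, the fourth equality
follows from the definition of the linear map $L^\kappa_{i,v_i}$, and the fifth equality follows from Lemma 3.14. This
establishes Item 3 of Lemma 4.6 for $L^\kappa_{\leq j}$.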

Finally, since $L_{\leq j}$ is a $j$-level CL function over $V$, using the inductive hypothesis we have that $L^\kappa_{\leq j}$ is a
$j$-level CL function over $\kappa(V)$ when $j \in \{1,2,\ldots,\ell-1\}$. It remains to establish that $L^\kappa_{\leq \ell}$ is an $\ell$-level CL
function. Since $L$ is an $\ell$-level CL function, there exist register subspaces $V_1, V_{>1}$ such that $V = V_1 \oplus V_{>1}$,
a linear map $L_1 : V_1 \to V_1$ and a collection of $(\ell-1)$-level CL functions $\{L_{>1,v} : V_{>1} \to V_{>1}\}_{v \in L_1(V_1)}$




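such that $L(x) = x^{L_1} + L_{>1,\,x^{L_1}}(x^{V_{>1}})$ for all $x \in V$. Observe that $L^\kappa_1$ is a 1-level CL function on $V^\kappa_1 =
\kappa(V_1)$, and for $v' = \kappa(v) \in L^\kappa_1(V^\kappa_1)$, the inductive hypothesis implies the function $L^\kappa_{>1,v'}$ is an $(\ell-1)$-level
CL function on $V^\kappa_{>1} = \kappa(V_{>1})$. Furthermore, since $L^\kappa_{\leq \ell} = L^\kappa$, we have that for all $y \in \kappa(V)$ with $y = \kappa(x)$
for some $x \in V$,

$$L^\kappa_{\leq \ell}(y) = L^\kappa(y) = L^\kappa_1\big( y^{V^\kappa_1} \big) + L^\kappa_{>1,\,y^{L^\kappa_1}}\big( y^{V^\kappa_{>1}} \big) ,$$

which implies that $L^\kappa_{\leq \ell}$ is an $\ell$-level CL function over $\kappa(V) = \kappa(V_1) \oplus \kappa(V_{>1})$. This establishes Item 1 of
Lemma 4.6, and completes the induction. □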


Lemma 4.12. Let $V = \mathbb{F}_q^n$ for some integer $n$ and prime power $q = p^t$ for odd $t$. Let $L, R : V \to V$
be CL functions. Let $L^\kappa, R^\kappa : \kappa(V) \to \kappa(V)$ be the associated downsized CL functions, as defined in
Definition 4.10. Then the distribution $\mu_{L^\kappa,R^\kappa}$ over $\kappa(V) \times \kappa(V)$ defined in Definition 4.5 is identical to
the distribution of $(x,y) \in \kappa(V) \times \kappa(V)$ obtained by first sampling $(x',y')$ according to $\mu_{L,R}$ and then
returning $(\kappa(x'), \kappa(y'))$.


Proof. The fact that $L^\kappa, R^\kappa$ are well-defined CL functions follows from Lemma 4.11. The lemma is immediate
from the definition of $\mu_{L^\kappa,R^\kappa}$ and the fact that $\kappa$ is a bijection. □


4.2 Conditionally linear samplers 


Samplers are Turing machines that perform computations corresponding to CL functions defined in Sec- 
tion 4.1. The inputs and outputs of the sampler are binary strings that are interpreted as representing data 
of different types (integers, bits, vectors in F5, etc.). See Section 3.3.2 and in particular Remark 3.19 for an 
in-depth discussion of representing structured objects on a Turing machine. 


Definition 4.13. A function $q : \mathbb{N} \to \mathbb{N}$ is an admissible field size function if for all $n \in \mathbb{N}$, $q(n)$ is an
admissible field size as defined in Definition 3.15.


Definition 4.14 (Conditionally linear samplers). Let $q : \mathbb{N} \to \mathbb{N}$ be an admissible field size function, and
let $s : \mathbb{N} \to \mathbb{N}$ be a function. A 6-input Turing machine $S$ is an $\ell$-level conditionally linear sampler with
field size $q(n)$ and dimension $s(n)$ if for all $n \in \mathbb{N}$, letting $q = q(n)$ and $s = s(n)$, there exist $\ell$-level CL
functions $L^{A,n}, L^{B,n} : \mathbb{F}_q^s \to \mathbb{F}_q^s$ with marginal functions $\{L^{w,n}_{\leq j}\}$ and factor spaces $\{V^{w,n}_{j,u}\}$ for $w \in \{A,B\}$
satisfying the conditions of Lemma 4.6, such that for all $w \in \{A,B\}$, $j \in \{1,\ldots,\ell\}$, $z \in \mathbb{F}_q^s$:


• On input $(n, \text{DIMENSION})$, the sampler $S$ outputs the dimension $s(n)$.


• On input $(n, w, \text{MARGINAL}, j, z)$, the sampler $S$ outputs the binary representation of $L^{w,n}_{\leq j}(z)$.


• On input $(n, w, \text{LINEAR}, j, u, y)$, the sampler $S$ outputs the binary representation of $L^{w,n}_{j,u}(y)$, where $u$
is interpreted as an element of $V^{w,n}_{<j}$.


• On input $(n, w, \text{FACTOR}, j, u)$, the sampler $S$ outputs the $j$-th factor space $V^{w,n}_{j,u}$ of $L^{w,n}$ with prefix
$u \in L^{w,n}_{<j}(V)$, represented as a vector in $\{0,1\}^s$ indicating which elementary basis vectors of $\mathbb{F}_q^s$ span
the factor space.

We call $\mathbb{F}_q^s$ the ambient space of $S$ on index $n$. We call the CL functions $L^{w,n}$ for $w \in \{A,B\}$ the CL
functions of $S$ on index $n$. The time complexity of $S$, denoted as $\mathrm{TIME}_S(n)$, is the number of steps before
$S$ halts for index $n$.




Remark 4.15. Conditionally linear samplers are defined to have six input tapes, but depending on the input,
not all input tapes are read. For example, if the second input tape has the input DIMENSION, then the
remaining input tapes are ignored. Thus for notational convenience we write samplers with different numbers
of arguments, depending on the type of query. The number of arguments is always at most 6,
however.
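To make the query interface of Definition 4.14 concrete, here is a toy 2-level sampler written as an ordinary Python function rather than a Turing machine. Everything about it is an illustrative assumption: it uses the CL function of Example 4.3 for both players (so the argument `w` is ignored), it fixes the ambient space $\mathbb{F}_2^3$ for every index `n`, and it returns tuples instead of binary strings.

```python
def toy_sampler(n, *rest):
    """Toy 2-level sampler over F_2^3 built from Example 4.3 (hypothetical, for illustration)."""
    if rest[0] == "DIMENSION":
        return 3
    w, query, j = rest[0], rest[1], rest[2]  # w in {"A", "B"} is ignored in this toy
    if query == "MARGINAL":
        x0, x1, x2 = rest[3]
        if j == 1:
            return (0, 0, x2)                              # L_{<=1}(z) = L_1(z^{V_1})
        return (0, (x0 * x2 + x1 * x2 + x0) % 2, x2)       # L_{<=2} = L
    u = rest[3]                                            # prefix
    if query == "FACTOR":
        # Indicator vectors of V_1 = span{e_2} and V_2 = span{e_0, e_1}.
        return (0, 0, 1) if j == 1 else (1, 1, 0)
    if query == "LINEAR":
        y0, y1, y2 = rest[4]
        if j == 1:
            return (0, 0, y2)                              # L_1 is the identity on V_1
        x2 = u[2]                                          # value read from the prefix
        return (0, (y0 * x2 + y1 * x2 + y0) % 2, 0)        # L_{2,u} applied to y^{V_2}
    raise ValueError("unknown query type")

# Example queries.
dim = toy_sampler(1, "DIMENSION")
prefix = toy_sampler(1, "A", "MARGINAL", 1, (1, 0, 1))
second = toy_sampler(1, "A", "LINEAR", 2, prefix, (1, 0, 1))
full = toy_sampler(1, "A", "MARGINAL", 2, (1, 0, 1))
```

In this toy, the second marginal equals the first marginal plus the output of the second linear map evaluated with the first marginal as prefix, which mirrors Item 3 of Lemma 4.6.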


The following definition shows how samplers naturally correspond to conditionally linear distributions. 


Definition 4.16 (Distribution of a sampler). Let $S$ be a sampler with field size $q(n)$ and dimension $s(n)$. For
each $n \in \mathbb{N}$, let $L^{A,n}, L^{B,n}$ denote the CL functions of $S$ on index $n$. Let $\mu_{S,n}$ denote the CL distribution
$\mu_{L^{A,n},L^{B,n}}$ corresponding to $(L^{A,n}, L^{B,n})$, as defined in Definition 4.5. We call $\mu_{S,n}$ the distribution of
sampler $S$ on index $n$.


The following provides a definition of a “downsized” sampler that can be obtained from any sampler $S$
over an admissible field $\mathbb{F}_q$.


Definition 4.17 (Downsized sampler). Let $q : \mathbb{N} \to \mathbb{N}$ be an admissible field size function. Let $S$ be an
$\ell$-level sampler with field size $q(n)$ and dimension $s(n)$. Define $\kappa(S)$ as the following Turing machine. For
all $n \in \mathbb{N}$, $w \in \{A,B\}$, $j \in \{1,\ldots,\ell\}$, and $z \in \mathbb{F}_2^{s \log q}$ where $q = q(n)$ and $s = s(n)$:

• On input $(n, \text{DIMENSION})$, the sampler returns the output of $S(n, \text{DIMENSION})$ multiplied by $\log q$.

• On input $(n, w, \text{MARGINAL}, j, z)$, the sampler $\kappa(S)$ returns the output of $S(n, w, \text{MARGINAL}, j, z)$.

• On input $(n, w, \text{LINEAR}, j, u', y)$, the sampler $\kappa(S)$ computes $u$ such that $u' = \kappa(u)$ and returns the
output of $S(n, w, \text{LINEAR}, j, u, y)$.

• On input $(n, w, \text{FACTOR}, j, u')$, the sampler $\kappa(S)$ computes $u$ such that $u' = \kappa(u)$ and the indicator
vector
$$C = S(n, w, \text{FACTOR}, j, u) \in \{0,1\}^s ,$$
and returns the expanded indicator vector $(D_1, D_2, \ldots, D_s) \in (\{0,1\}^{\log q})^s$ where $D_i$ is the all ones
vector in $\{0,1\}^{\log q}$ if $C_i = 1$ and $D_i$ is the all zeroes vector otherwise.
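The FACTOR rule of Definition 4.17 is a simple bit-expansion: every coordinate of $\mathbb{F}_q^s$ becomes a block of $\log q$ binary coordinates, so each indicator bit is repeated $\log q$ times. A minimal sketch, with a hypothetical helper name and example sizes ($s = 3$, $q = 8$) chosen only for illustration:

```python
def expand_indicator(C, log_q):
    """Replace each indicator bit C_i by a block D_i of log_q copies of that bit."""
    D = []
    for bit in C:
        D.extend([bit] * log_q)
    return D

# Hypothetical sizes: s = 3 coordinates over F_q with q = 8, so log q = 3.
C = [1, 0, 1]
D = expand_indicator(C, 3)
```

The output has length $s \log q$, matching the dimension reported by the downsized sampler on a DIMENSION query.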


The next lemma establishes that $\kappa(S)$ is a well-defined CL sampler, in the sense that it can be derived
from a family of CL functions as in Definition 4.14.


Lemma 4.18. Let $\ell \geq 1$ be such that $S$ is an $\ell$-level CL sampler, and let $q(n)$ and $s(n)$ be as in
Definition 4.17. Then the Turing machine $\kappa(S)$ is an $\ell$-level CL sampler with field size 2, dimension
$s'(n) = s(n) \log q(n)$, and time complexity

$$\mathrm{TIME}_{\kappa(S)}(n) = O\big( \mathrm{TIME}_S(n) \log q(n) \big) .$$

Furthermore, for every integer $n \in \mathbb{N}$ the CL functions of $\kappa(S)$ on index $n$ are $(L^{A,n})^\kappa$ and $(L^{B,n})^\kappa$, where
$L^{A,n}, L^{B,n}$ are the CL functions of $S$ on index $n$.


Proof. To show that $\kappa(S)$ is an $\ell$-level CL sampler we first show the “Furthermore” part, i.e. verify that
for any integer $n \geq 1$ the CL functions $(L^{A,n})^\kappa$ and $(L^{B,n})^\kappa$ are its associated CL functions on index $n$, as
defined in Definition 4.14.

Observe that for $z \in V$, the binary representation of $z$ as an element of $\{0,1\}^{s \log q}$ passed as input to
$S$ is, by definition (see Section 3.3.2), identical to the binary representation of $\kappa(z)$. Using the definition




$(L^{w,n})^\kappa = \kappa \circ L^{w,n} \circ \kappa^{-1}$ for $w \in \{A,B\}$ this justifies that $\kappa(S)$ returns the correct output when executed
on inputs of the form $(n, \text{DIMENSION})$, $(n, w, \text{MARGINAL}, j, z)$ and $(n, w, \text{LINEAR}, j, u', y)$.

Next, if $T$ is a register subspace of $\mathbb{F}_q^s$ with indicator vector $C \in \{0,1\}^s$, then $\kappa(T)$ is a register subspace
of $\mathbb{F}_2^{s \log q}$ with indicator vector $D$ defined from $C$ as in Definition 4.17. Thus the output of $\kappa(S)$ on input
$(n, w, \text{FACTOR}, j, u')$ is equal to the indicator vector of $\kappa(V^{w,n}_{j,u})$, which is the $j$-th factor space of $(L^{w,n})^\kappa$ with
prefix $u' = \kappa(u)$.

The time complexity of $\kappa(S)$ is the same as that of the sampler $S$, except that it takes $O(\log q(n))$ times
longer to output the factor space indicator vectors. □




5 Nonlocal Games and MIP* 


We introduce definitions associated with nonlocal games and strategies that will be used throughout. 


5.1 Games and strategies 


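Definition 5.1 (Two-player one-round games). A two-player one-round game $\mathfrak{G}$ is specified by a tuple
$(\mathcal{X}, \mathcal{Y}, \mathcal{A}, \mathcal{B}, \mu, D)$ where


1. $\mathcal{X}$ and $\mathcal{Y}$ are finite sets (called the question alphabets),

2. $\mathcal{A}$ and $\mathcal{B}$ are finite sets (called the answer alphabets),

3. $\mu$ is a probability distribution over $\mathcal{X} \times \mathcal{Y}$ (called the question distribution), and

4. $D : \mathcal{X} \times \mathcal{Y} \times \mathcal{A} \times \mathcal{B} \to \{0,1\}$ is a function (called the decision predicate).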


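Definition 5.2 (Tensor product strategies). A tensor product strategy $\mathscr{S}$ for a game $\mathfrak{G} = (\mathcal{X}, \mathcal{Y}, \mathcal{A}, \mathcal{B}, \mu, D)$
is a tuple $(|\psi\rangle, A, B)$ where


• $|\psi\rangle$ is a pure quantum state, i.e. a unit vector in $\mathcal{H}_A \otimes \mathcal{H}_B$ for finite dimensional complex Hilbert
spaces $\mathcal{H}_A$, $\mathcal{H}_B$,


• $A$ is a set $\{A^x\}$ such that for every $x \in \mathcal{X}$, $A^x = \{A^x_a\}_{a \in \mathcal{A}}$ is a POVM over $\mathcal{H}_A$, and

• $B$ is a set $\{B^y\}$ such that for every $y \in \mathcal{Y}$, $B^y = \{B^y_b\}_{b \in \mathcal{B}}$ is a POVM over $\mathcal{H}_B$.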


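Definition 5.3 (Tensor product value). The tensor product value of a tensor product strategy $\mathscr{S} = (|\psi\rangle, A, B)$
with respect to a game $\mathfrak{G} = (\mathcal{X}, \mathcal{Y}, \mathcal{A}, \mathcal{B}, \mu, D)$ is defined as

$$\mathrm{val}^*(\mathfrak{G}, \mathscr{S}) = \sum_{x,y,a,b} \mu(x,y)\, D(x,y,a,b)\, \langle\psi| A^x_a \otimes B^y_b |\psi\rangle .$$

For $\nu \in [0,1]$ we say that the strategy $\mathscr{S}$ passes (or wins) $\mathfrak{G}$ with probability $\nu$ if $\mathrm{val}^*(\mathfrak{G}, \mathscr{S}) \geq \nu$. The
tensor product value of $\mathfrak{G}$ is defined as

$$\mathrm{val}^*(\mathfrak{G}) = \sup_{\mathscr{S}} \mathrm{val}^*(\mathfrak{G}, \mathscr{S}) ,$$

where the supremum is taken over all tensor product strategies $\mathscr{S}$ for $\mathfrak{G}$.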
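The value formula of Definition 5.3 can be evaluated directly for small games. The sketch below does so for the CHSH game with a standard entangled strategy; the choice of game and strategy is an illustration that does not appear in this section, and the helpers `kron`, `expectation`, `projectors`, and `value` are hypothetical names. All matrices are real, so plain Python floats suffice.

```python
import math
from itertools import product

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def expectation(psi, M):
    """<psi| M |psi> for a real vector psi and a real matrix M."""
    n = len(psi)
    return sum(psi[i] * M[i][j] * psi[j] for i in range(n) for j in range(n))

def projectors(obs):
    """Projective measurement {(I + O)/2, (I - O)/2} from a +/-1-valued observable O."""
    I = [[1.0, 0.0], [0.0, 1.0]]
    return [
        [[(I[i][j] + s * obs[i][j]) / 2 for j in range(2)] for i in range(2)]
        for s in (1, -1)
    ]

Z = [[1.0, 0.0], [0.0, -1.0]]
X = [[0.0, 1.0], [1.0, 0.0]]
r = 1 / math.sqrt(2)

# Alice measures Z or X; Bob measures (Z + X)/sqrt(2) or (Z - X)/sqrt(2).
A = {0: projectors(Z), 1: projectors(X)}
B = {
    0: projectors([[r * (Z[i][j] + X[i][j]) for j in range(2)] for i in range(2)]),
    1: projectors([[r * (Z[i][j] - X[i][j]) for j in range(2)] for i in range(2)]),
}
psi = [r, 0.0, 0.0, r]  # (|00> + |11>)/sqrt(2)

# CHSH: uniform questions, win iff a XOR b equals x AND y.
mu = {(x, y): 0.25 for x, y in product((0, 1), repeat=2)}
D = lambda x, y, a, b: 1 if (a ^ b) == (x & y) else 0

def value(psi, A, B, mu, D):
    """Sum over x, y, a, b of mu(x,y) D(x,y,a,b) <psi| A^x_a (x) B^y_b |psi>."""
    return sum(
        mu[x, y] * D(x, y, a, b) * expectation(psi, kron(A[x][a], B[y][b]))
        for (x, y) in mu for a in (0, 1) for b in (0, 1)
    )

chsh_value = value(psi, A, B, mu, D)
```

For this strategy the computed value agrees with $\cos^2(\pi/8) \approx 0.854$, the well-known optimal entangled value of CHSH.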


Remark 5.4. Unless specified otherwise, all strategies considered in this paper are tensor product
strategies, and we simply call them strategies. Similarly, we refer to $\mathrm{val}^*(\mathfrak{G})$ as the value of the game $\mathfrak{G}$.


Definition 5.5 (Projective strategies). We say that a strategy $\mathscr{S} = (|\psi\rangle, A, B)$ is projective if all the
measurements $\{A^x_a\}_a$ and $\{B^y_b\}_b$ are projective.


Definition 5.6. A game $\mathfrak{G} = (\mathcal{X}, \mathcal{Y}, \mathcal{A}, \mathcal{B}, \mu, D)$ is symmetric if the question and answer alphabets are
the same for both players (i.e. $\mathcal{X} = \mathcal{Y}$ and $\mathcal{A} = \mathcal{B}$), the distribution $\mu$ is symmetric (i.e. $\mu(x,y) =
\mu(y,x)$), and the decision predicate $D$ treats both players symmetrically (i.e. for all $x, y, a, b$, $D(x,y,a,b) =
D(y,x,b,a)$).

We call a strategy $\mathscr{S} = (|\psi\rangle, A, B)$ symmetric if $|\psi\rangle$ is a (pure) state in $\mathcal{H} \otimes \mathcal{H}$, for some Hilbert space
$\mathcal{H}$, that is invariant under permutation of the two factors, and the measurement operators of both players are
identical.




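We often specify symmetric games $\mathfrak{G}$ and symmetric strategies $\mathscr{S}$ using a compact notation: we write
$\mathfrak{G} = (\mathcal{X}, \mathcal{A}, \mu, D)$ and $\mathscr{S} = (|\psi\rangle, M)$ where $M$ denotes the set of measurement operators for both players.


Lemma 5.7. Let $\mathfrak{G} = (\mathcal{X}, \mathcal{A}, \mu, D)$ be a symmetric game such that $\mathrm{val}^*(\mathfrak{G}) = 1 - \varepsilon$ for some $\varepsilon \geq 0$. Then
there exists a symmetric and projective strategy $\mathscr{S} = (|\psi\rangle, M)$ such that $\mathrm{val}^*(\mathfrak{G}, \mathscr{S}) \geq 1 - \varepsilon$.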


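Proof. By definition for any $\varepsilon' > \varepsilon$ there exists a strategy $\mathscr{S}' = (|\psi'\rangle, A, B)$ such that $\mathrm{val}^*(\mathfrak{G}, \mathscr{S}') \geq
1 - \varepsilon'$. Using Naimark’s theorem (Theorem 3.24) we can assume without loss of generality that $|\psi'\rangle \in
\mathbb{C}^d_{A'} \otimes \mathbb{C}^d_{B'}$ for some integer $d$ and that for every $x$, $A^x$ and $B^x$ are projective measurements. Let

$$|\psi\rangle = \frac{1}{\sqrt{2}} \Big( |0\rangle_A |1\rangle_B |\psi'\rangle_{A'B'} + |1\rangle_A |0\rangle_B |\psi'_\tau\rangle_{A'B'} \Big) \in (\mathbb{C}^2_A \otimes \mathbb{C}^d_{A'}) \otimes (\mathbb{C}^2_B \otimes \mathbb{C}^d_{B'}) ,$$

where $|\psi'_\tau\rangle$ is obtained from $|\psi'\rangle$ by permuting the two players’ registers. Observe that $|\psi\rangle$ is invariant under
permutation of $AA'$ and $BB'$.

For any question $x \in \mathcal{X}$ define the measurement $M^x = \{M^x_a\}_{a \in \mathcal{A}}$ acting on the Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^d$
as follows:

$$M^x_a = |0\rangle\langle 0| \otimes A^x_a + |1\rangle\langle 1| \otimes B^x_a ,$$

i.e. for $|\varphi\rangle \in \mathbb{C}^d$, $M^x_a(|0\rangle|\varphi\rangle) = |0\rangle A^x_a|\varphi\rangle$ and $M^x_a(|1\rangle|\varphi\rangle) = |1\rangle B^x_a|\varphi\rangle$. When Alice receives question
$x$, she measures $M^x$ on registers $AA'$, and when Bob receives question $y$, he measures $M^y$ on registers
$BB'$. Using that by assumption the decision predicate $D$ for $\mathfrak{G}$ is symmetric, it is not hard to verify that
$\mathrm{val}^*(\mathfrak{G}, \mathscr{S}) = \mathrm{val}^*(\mathfrak{G}, \mathscr{S}')$. □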


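Definition 5.8. Let $\mathfrak{G} = (\mathcal{X}, \mathcal{Y}, \mathcal{A}, \mathcal{B}, \mu, D)$ be a game, and let $\mathscr{S} = (|\psi\rangle, A, B)$ be a strategy for $\mathfrak{G}$ such
that $\mathcal{H}_A = \mathcal{H}_B$. Let $S \subseteq \mathcal{X} \times \mathcal{Y}$ denote the support of the question distribution $\mu$, i.e. the set of $(x,y)$ such
that $\mu(x,y) > 0$. We say that $\mathscr{S}$ is a commuting strategy for $\mathfrak{G}$ if for all question pairs $(x,y) \in S$, we have
$[A^x_a, B^y_b] = 0$ for all $a \in \mathcal{A}$, $b \in \mathcal{B}$, where $[A, B] = AB - BA$ denotes the commutator.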


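Definition 5.9 (Consistent measurements). Let $\mathcal{A}$ be a finite set, let $|\psi\rangle \in \mathcal{H} \otimes \mathcal{H}$ be a state, and let
$\{M_a\}_{a \in \mathcal{A}}$ be a projective measurement on $\mathcal{H}$. We say that $\{M_a\}_{a \in \mathcal{A}}$ is consistent on $|\psi\rangle$ if and only if

$$\forall a \in \mathcal{A}, \quad M_a \otimes I_B |\psi\rangle = I_A \otimes M_a |\psi\rangle .$$

When the state $|\psi\rangle$ is clear from context we simply say that $\{M_a\}_{a \in \mathcal{A}}$ is consistent.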


The terminology “consistent” arises from the fact that when the measurement $\{M_a\}$ is performed on
both registers of $|\psi\rangle$, the probability of obtaining twice the same outcome is

$$\sum_a \langle\psi| M_a \otimes M_a |\psi\rangle = \sum_a \langle\psi| I_A \otimes (M_a)^2 |\psi\rangle = \sum_a \langle\psi| I_A \otimes M_a |\psi\rangle = 1 ,$$

where the first equality is by definition of consistency, the second uses that $\{M_a\}$ is projective, and the last
uses $\sum_a M_a = I$.
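As a small numerical illustration of Definition 5.9, the computational-basis measurement is consistent on the two-qubit maximally entangled state, and the agreement probability computed above equals 1. The state and measurement below are illustrative choices, and `kron` and `matvec` are hypothetical helper names.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matvec(M, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

I2 = [[1.0, 0.0], [0.0, 1.0]]
# Computational-basis projective measurement {|0><0|, |1><1|}.
M = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
r = 1 / math.sqrt(2)
psi = [r, 0.0, 0.0, r]  # (|00> + |11>)/sqrt(2)

# Consistency: (M_a (x) I)|psi> equals (I (x) M_a)|psi> for every outcome a.
consistent = all(
    max(abs(u - v) for u, v in zip(matvec(kron(Ma, I2), psi), matvec(kron(I2, Ma), psi))) < 1e-12
    for Ma in M
)

# Probability that both registers yield the same outcome: sum_a <psi| M_a (x) M_a |psi>.
agree_prob = sum(
    sum(p * q for p, q in zip(psi, matvec(kron(Ma, Ma), psi))) for Ma in M
)
```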


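Definition 5.10 (Consistent strategies). Let $\mathscr{S} = (|\psi\rangle, A, B)$ be a projective strategy with state $|\psi\rangle \in
\mathcal{H} \otimes \mathcal{H}$, for some Hilbert space $\mathcal{H}$, which is defined on question alphabets $\mathcal{X}$ and $\mathcal{Y}$ and answer alphabets
$\mathcal{A}$ and $\mathcal{B}$, respectively. We say that the strategy $\mathscr{S}$ is consistent if for all $x \in \mathcal{X}$, the measurement $\{A^x_a\}_{a \in \mathcal{A}}$
is consistent on $|\psi\rangle$ and if for all $y \in \mathcal{Y}$, the measurement $\{B^y_b\}_{b \in \mathcal{B}}$ is consistent on $|\psi\rangle$.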




We almost always restrict our attention to symmetric strategies. In this case, consistency implies that 
whenever both players are sent the same question, they provide the same answer. 


Definition 5.11. We say that a strategy $\mathscr{S}$ for a game $\mathfrak{G}$ is PCC if it is projective, consistent, and commuting
for $\mathfrak{G}$. Additionally, we say that a PCC strategy $\mathscr{S}$ is SPCC if it is furthermore symmetric.


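Recall that for a (pure) state $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$, the Schmidt rank of $|\psi\rangle$ is the smallest integer $k$ such that

$$|\psi\rangle = \sum_{i=1}^{k} \alpha_i |u_i\rangle \otimes |v_i\rangle ,$$

for some orthonormal families $\{|u_i\rangle\} \subset \mathcal{H}_A$ and $\{|v_i\rangle\} \subset \mathcal{H}_B$ and $\alpha_i > 0$. The coefficients $\alpha_i$ are uniquely
defined by $|\psi\rangle$ and called Schmidt coefficients. The state $|\psi\rangle$ is called maximally entangled if $\alpha_i = \alpha_1$ for
all $i = 1,\ldots,k$.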


Definition 5.12 (Entanglement requirements of a game). For all games $\mathfrak{G}$ and $\nu \in [0,1]$, let $\mathcal{E}(\mathfrak{G}, \nu)$ denote
the minimum integer $d$ such that there exists a finite dimensional tensor product strategy $\mathscr{S}$ that achieves
success probability at least $\nu$ in the game $\mathfrak{G}$ with a state $|\psi\rangle$ whose Schmidt rank is at most $d$. If there is no
finite dimensional strategy that achieves success probability $\nu$, then define $\mathcal{E}(\mathfrak{G}, \nu)$ to be $\infty$.


Remark 5.13. A strategy $\mathscr{S} = (|\psi\rangle, A, B)$ for a symmetric game $\mathfrak{G} = (\mathcal{X}, \mathcal{X}, \mathcal{A}, \mathcal{A}, \mu, D)$ is called
synchronous if it holds that for every $x \in \mathcal{X}$ and $a \neq b \in \mathcal{A}$, $\langle\psi| A^x_a \otimes B^x_b |\psi\rangle = 0$; in other words, the
players never return different answers when simultaneously asked the same question. As shown in [PSS+16]
the condition for a finite-dimensional strategy of being synchronous is equivalent to the condition that it is
projective, consistent, and moreover $|\psi\rangle$ is a maximally entangled state. (The equivalence is extended to
infinite-dimensional strategies, as well as correlations induced by limits of finite-dimensional strategies,
in [KPS18].)


5.2 Distance measures 


We introduce several distance measures that are used throughout. 


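Definition 5.14 (Distance between states). Let $\{|\psi_n\rangle\}_{n \in \mathbb{N}}$ and $\{|\psi'_n\rangle\}_{n \in \mathbb{N}}$ be two families of states in the
same space $\mathcal{H}$. For some function $\delta : \mathbb{N} \to [0,1]$ we say that $\{|\psi_n\rangle\}$ and $\{|\psi'_n\rangle\}$ are $\delta$-close, denoted as
$|\psi\rangle \approx_\delta |\psi'\rangle$, if $\| |\psi_n\rangle - |\psi'_n\rangle \|^2 = O(\delta(n))$. (For convenience we generally leave the dependence of the
states and $\delta$ on the indexing parameter $n$ implicit, writing e.g. $|\psi\rangle$ for $|\psi_n\rangle$.)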


Definition 5.15 (Consistency between POVMs). Let $\mathcal{X}$ be a finite set and $\mu$ a distribution on $\mathcal{X}$. Let
$|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ be a quantum state, and for all $x \in \mathcal{X}$, $\{A^x_a\}$ and $\{B^x_a\}$ POVMs. We write

$$A^x_a \otimes I_B \simeq_\delta I_A \otimes B^x_a$$

on state $|\psi\rangle$ and distribution $\mu$ if

$$\mathbb{E}_{x \sim \mu} \sum_{a \neq b} \langle\psi| A^x_a \otimes B^x_b |\psi\rangle = O(\delta) .$$

In this case, we say that $\{A^x_a\}$ and $\{B^x_a\}$ are $\delta$-consistent on $|\psi\rangle$.
4 Here the use of the $O(\cdot)$ notation refers to the fact that the notation will often be used to measure consistency for families of
POVMs and states indexed by an implicit parameter $n \in \mathbb{N}$, which will always be clear from context but omitted for legibility, see
Definition 5.14. The $O(\cdot)$ is taken as $n \to \infty$.




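Note that a consistent measurement on $|\psi\rangle$ according to Definition 5.9 is 0-consistent with itself on the
same $|\psi\rangle$ and under the singleton distribution, according to Definition 5.15 (and vice-versa). This is because

$$\sum_a \langle\psi| M_a \otimes M_a |\psi\rangle = 1 \iff \forall a, \;\; I \otimes M_a |\psi\rangle = M_a \otimes I |\psi\rangle ,$$

because using that $\{M_a\}$ is a POVM there must be equality in the Cauchy-Schwarz inequality

$$1 = \sum_a \langle\psi| M_a \otimes M_a |\psi\rangle \leq \Big( \sum_a \big\| I \otimes M_a |\psi\rangle \big\|^2 \Big)^{1/2} \Big( \sum_a \big\| M_a \otimes I |\psi\rangle \big\|^2 \Big)^{1/2} \leq 1 .$$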


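Definition 5.16 (Distance between POVMs). Let $\mathcal{X}$ be a finite set and $\mu$ a distribution on $\mathcal{X}$. Let $|\psi\rangle \in \mathcal{H}$
be a quantum state, and for all $x \in \mathcal{X}$, $\{M^x_a\}$ and $\{N^x_a\}$ two POVMs on $\mathcal{H}$. We say that $\{M^x_a\}$ and $\{N^x_a\}$
are $\delta$-close on state $|\psi\rangle$ and under distribution $\mu$ if

$$\mathbb{E}_{x \sim \mu} \sum_a \big\| (M^x_a - N^x_a) |\psi\rangle \big\|^2 \leq O(\delta) ,$$

and we write $M^x_a \approx_\delta N^x_a$ to denote this when the state $|\psi\rangle$ and distribution $\mu$ are clear from context. This
distance is referred to as the state-dependent distance.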
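The state-dependent distance of Definition 5.16 is straightforward to evaluate numerically. In the sketch below, the single-question example (the computational-basis measurement against a copy rotated by an angle `theta`, evaluated on the state $|0\rangle$) is an assumption made purely for illustration; a short calculation shows the distance is $2\sin^2\theta$ in this case.

```python
import math

def matvec(M, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def outer(v):
    """Rank-one projector |v><v| for a real unit vector v."""
    return [[a * b for b in v] for a in v]

def state_dependent_distance(Ms, Ns, mu, psi):
    """E_{x ~ mu} sum_a ||(M^x_a - N^x_a)|psi>||^2 for real matrices and vectors."""
    total = 0.0
    for x, prob in mu.items():
        for Ma, Na in zip(Ms[x], Ns[x]):
            diff = [m - n for m, n in zip(matvec(Ma, psi), matvec(Na, psi))]
            total += prob * sum(d * d for d in diff)
    return total

theta = 0.1
Ms = {0: [outer([1.0, 0.0]), outer([0.0, 1.0])]}
Ns = {0: [outer([math.cos(theta), math.sin(theta)]),
          outer([-math.sin(theta), math.cos(theta)])]}
mu = {0: 1.0}     # singleton question distribution
psi = [1.0, 0.0]  # the state |0>
dist = state_dependent_distance(Ms, Ns, mu, psi)
```

The distance depends on the state: the same two measurements can be close on one state and far apart on another, which is why the definition fixes $|\psi\rangle$.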


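Definition 5.17 (Distance between strategies). Let $\mathfrak{G} = (\mathcal{X}, \mathcal{Y}, \mathcal{A}, \mathcal{B}, \mu, D)$ be a nonlocal game and let
$\mathscr{S} = (\psi, A, B)$, $\mathscr{S}' = (\psi', A', B')$ be strategies for $\mathfrak{G}$. For $\delta \in [0,1]$ we say that $\mathscr{S}$ is $\delta$-close to $\mathscr{S}'$ if the
following conditions hold.


1. The states $|\psi\rangle$, $|\psi'\rangle$ are states in the same Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$ and are $\delta$-close.


2. For all $x \in \mathcal{X}$, $y \in \mathcal{Y}$, we have $A^x_a \approx_\delta (A')^x_a$ and $B^y_b \approx_\delta (B')^y_b$, with the approximations holding
under the distribution $\mu$, and on either $|\psi\rangle$ or $|\psi'\rangle$.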


We record several useful facts about the consistency measure and the state-dependent distance without 
proof. Readers are referred to Sections 4.4 and 4.5 in [NW19] for additional discussion and proofs. 


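Fact 5.18 (Fact 4.13 and Fact 4.14 in [NW19]). For POVMs $\{A^x_a\}$ and $\{B^x_a\}$, the following hold.

1. If $A^x_a \otimes I_B \simeq_\delta I_A \otimes B^x_a$ then $A^x_a \otimes I_B \approx_\delta I_A \otimes B^x_a$.


2. If $A^x_a \otimes I_B \approx_\delta I_A \otimes B^x_a$ and $\{A^x_a\}$ and $\{B^x_a\}$ are projective measurements, then $A^x_a \otimes I_B \simeq_\delta I_A \otimes B^x_a$.


3. If $A^x_a \otimes I_B \approx_\delta I_A \otimes B^x_a$ and either $\{A^x_a\}$ or $\{B^x_a\}$ is a projective measurement, then $A^x_a \otimes I_B \simeq_{\delta^{1/2}}
I_A \otimes B^x_a$.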


Fact 5.19 (Fact 4.20 in [NW19]). Let $\mathcal{A}, \mathcal{B}, \mathcal{C}$ be finite sets, and let $\mu$ be a distribution over question pairs
$(x,y)$. Let $\{A^x_{a,b}\}$ and $\{B^x_{a,b}\}$ be operators whose outcomes range over the product set $\mathcal{A} \times \mathcal{B}$. Suppose
a set of operators $\{C^y_{a,c}\}$, whose outcomes range over the product set $\mathcal{A} \times \mathcal{C}$, satisfies the condition
$\sum_c (C^y_{a,c})^\dagger (C^y_{a,c}) \leq I$ for all $a$ and all $y$. If $A^x_{a,b} \approx_\delta B^x_{a,b}$ on average over $x$ sampled from the corresponding
marginal of distribution $\mu$, then $C^y_{a,c} A^x_{a,b} \approx_\delta C^y_{a,c} B^x_{a,b}$ on average over $(x,y)$ sampled from $\mu$.




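Proof. Fix questions $x, y$ and answers $a \in \mathcal{A}$, $b \in \mathcal{B}$. We have then that

$$\sum_c \big\| (C^y_{a,c} A^x_{a,b} - C^y_{a,c} B^x_{a,b}) |\psi\rangle \big\|^2 = \sum_c \langle\psi| (A^x_{a,b} - B^x_{a,b})^\dagger (C^y_{a,c})^\dagger (C^y_{a,c}) (A^x_{a,b} - B^x_{a,b}) |\psi\rangle \tag{33}$$
$$\leq \langle\psi| (A^x_{a,b} - B^x_{a,b})^\dagger (A^x_{a,b} - B^x_{a,b}) |\psi\rangle \tag{34}$$
$$= \big\| (A^x_{a,b} - B^x_{a,b}) |\psi\rangle \big\|^2 \tag{35}$$

where the inequality follows from the assumption that $\sum_c (C^y_{a,c})^\dagger (C^y_{a,c}) \leq I$. Thus we obtain the desired
conclusion

$$\mathbb{E}_{(x,y) \sim \mu} \sum_{a,b,c} \big\| (C^y_{a,c} A^x_{a,b} - C^y_{a,c} B^x_{a,b}) |\psi\rangle \big\|^2 \leq \mathbb{E}_{x} \sum_{a,b} \big\| (A^x_{a,b} - B^x_{a,b}) |\psi\rangle \big\|^2 \leq O(\delta) .$$
□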


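Fact 5.20. Let $\mathcal{X}, \mathcal{A}$ denote finite sets, and let $\mathcal{G}$ denote a set of functions $g : \mathcal{X} \to \mathcal{A}$. Let $\{A^x_a\}$, $\{B^x_a\}$
be POVMs indexed by $\mathcal{X}$ and outcomes in $\mathcal{A}$. Let $\{S^x_g\}_{x \in \mathcal{X}, g \in \mathcal{G}}$ denote a set of operators such that for all
$x \in \mathcal{X}$, $\sum_g (S^x_g)^\dagger S^x_g \leq I$. If $A^x_a \approx_\delta B^x_a$ on average over $x$, then $S^x_g A^x_{g(x)} \approx_\delta S^x_g B^x_{g(x)}$.
Proof. We expand:

$$\mathbb{E}_x \sum_g \big\| S^x_g (A^x_{g(x)} - B^x_{g(x)}) |\psi\rangle \big\|^2 = \mathbb{E}_x \sum_g \langle\psi| (A^x_{g(x)} - B^x_{g(x)})^\dagger (S^x_g)^\dagger S^x_g (A^x_{g(x)} - B^x_{g(x)}) |\psi\rangle$$
$$= \mathbb{E}_x \sum_a \langle\psi| (A^x_a - B^x_a)^\dagger \Big( \sum_{g :\, g(x) = a} (S^x_g)^\dagger S^x_g \Big) (A^x_a - B^x_a) |\psi\rangle$$
$$\leq \mathbb{E}_x \sum_a \langle\psi| (A^x_a - B^x_a)^\dagger (A^x_a - B^x_a) |\psi\rangle$$
$$= \mathbb{E}_x \sum_a \big\| (A^x_a - B^x_a) |\psi\rangle \big\|^2 .$$

The inequality follows from the fact that $\sum_{g :\, g(x) = a} (S^x_g)^\dagger S^x_g \leq \sum_g (S^x_g)^\dagger S^x_g \leq I$. The last line is at most $\delta$
by assumption, and we obtain the desired conclusion. □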


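Lemma 5.21. Let $\mathcal{X}$ and $\mathcal{A}$ be finite sets and $\mu$ a distribution on $\mathcal{X}$. Let $\{A^x_a\}_{a\in\mathcal{A}}$ be a projective measurement
and let $\{B^x_a\}_{a\in\mathcal{A}}$ be a set of matrices. If $A^x_a \approx_\delta B^x_a$, then for all subsets $S \subseteq \mathcal{A}$, we have
$$\sum_{a\in S} A^x_a \approx_\delta \sum_{a\in S} A^x_a B^x_a .$$
Proof. We expand:
$$\mathop{\mathbb{E}}_x \Big\| \sum_{a\in S} (A^x_a - A^x_a B^x_a) |\psi\rangle \Big\|^2 = \mathop{\mathbb{E}}_x \Big\| \sum_{a\in S} A^x_a (A^x_a - B^x_a) |\psi\rangle \Big\|^2$$
$$= \mathop{\mathbb{E}}_x \sum_{a\in S} \langle\psi| (A^x_a - B^x_a)^\dagger A^x_a (A^x_a - B^x_a) |\psi\rangle$$
$$\le \mathop{\mathbb{E}}_x \sum_{a} \big\| (A^x_a - B^x_a) |\psi\rangle \big\|^2$$
where all expectations are taken according to $\mu$. In the second line we used the projectivity of $\{A^x_a\}$, and in
the third line we used that $A^x_a \le I$. □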




Fact 5.22 (Triangle inequality, Fact 4.28 in [NW19]). If $A^x_a \approx_\delta B^x_a$ and $B^x_a \approx_\varepsilon C^x_a$, then $A^x_a \approx_{\delta+\varepsilon} C^x_a$.


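Fact 5.23 (Triangle inequality for “$\simeq$”, Proposition 4.29 in [JNV+20]). If $A^x_a \otimes I_B \simeq_\varepsilon I_A \otimes B^x_a$, $C^x_a \otimes I_B \simeq_\delta I_A \otimes B^x_a$,
and $C^x_a \otimes I_B \simeq_\gamma I_A \otimes D^x_a$, then $A^x_a \otimes I_B \simeq_{\varepsilon + 2\sqrt{\delta+\gamma}} I_A \otimes D^x_a$.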




Fact 5.24 (Data processing, Fact 4.26 in [NW19]). Suppose $A^x_a \otimes I_B \simeq_\delta I_A \otimes B^x_a$. Then $A^x_{[f(\cdot)=b]} \otimes I_B \simeq_\delta
I_A \otimes B^x_{[f(\cdot)=b]}$.


The state-dependent distance is the right tool for reasoning about the closeness of measurement operators 
in a strategy. The following lemma ensures that, when two families of measurements are close on a state, 
changing from one family of measurement to the other only introduces a small error to the value of the 
strategy. 


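Lemma 5.25. Let $\{A^x_{a,b}\}$, $\{B^x_{a,b,c}\}$, $\{C^x_{a,c}\}$ be POVMs. Suppose $\{B^x_{a,b,c}\}$ is projective, and
$$A^x_{a,b} \otimes I_B \approx_\delta I_A \otimes B^x_{a,b} ,$$
$$C^x_{a,c} \otimes I_B \approx_\delta I_A \otimes B^x_{a,c} ,$$
where recall that $B^x_{a,b} = \sum_c B^x_{a,b,c}$ and similarly $B^x_{a,c} = \sum_b B^x_{a,b,c}$. Then the following approximate commutation
relation holds:
$$[A^x_{a,b}, C^x_{a,c}] \otimes I_B \approx_\delta 0 .$$
Proof. Applying Fact 5.19 to $C^x_{a,c} \otimes I_B \approx_\delta I_A \otimes B^x_{a,c}$ and $\{A^x_{a,b} \otimes I_B\}$, we have
$$A^x_{a,b} C^x_{a,c} \otimes I_B \approx_\delta A^x_{a,b} \otimes B^x_{a,c} . \qquad (36)$$
Similarly, applying Fact 5.19 to $A^x_{a,b} \otimes I_B \approx_\delta I_A \otimes B^x_{a,b}$ and $\{I_A \otimes B^x_{a,c}\}$, and using the fact that $\{B^x_{a,b,c}\}$ is
projective, we have
$$A^x_{a,b} \otimes B^x_{a,c} \approx_\delta I_A \otimes B^x_{a,c} B^x_{a,b} = I_A \otimes B^x_{a,b,c} . \qquad (37)$$
Combining Equations (36) and (37), we have
$$A^x_{a,b} C^x_{a,c} \otimes I_B \approx_\delta I_A \otimes B^x_{a,b,c} . \qquad (38)$$
A similar argument gives
$$C^x_{a,c} A^x_{a,b} \otimes I_B \approx_\delta I_A \otimes B^x_{a,b,c} . \qquad (39)$$
The claim follows from Equations (38) and (39). □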


The following lemma is a slightly modified version of [NW19, Fact 4.34]. 


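Lemma 5.26. Let $k > 0$ be an integer and let $\varepsilon > 0$. Let $\mathcal{X}$ be a finite set and $\mu$ a distribution over $\mathcal{X}$.
For each $1 \le i \le k$ let $\mathcal{G}_i$ be a finite set of functions $g_i : \mathcal{Y} \to \mathcal{R}_i$ and for each $x \in \mathcal{X}$ let $\{G^{i,x}_g\}_{g\in\mathcal{G}_i}$ be
a projective measurement. Suppose that for all $i \in \{1,\ldots,k\}$, $\mathcal{G}_i$ satisfies the following property: for any
two $g_i \ne g'_i \in \mathcal{G}_i$, the probability that $g_i(y) = g'_i(y)$ over a uniformly random $y \in \mathcal{Y}$ is at most $\varepsilon$.

Let $\{A^x_{g_1,\ldots,g_k}\}$ be a projective measurement with outcomes $(g_1,\ldots,g_k) \in \mathcal{G}_1 \times \cdots \times \mathcal{G}_k$. For each
$1 \le i \le k$, suppose that on average over $x \sim \mu$ and $y \in \mathcal{Y}$ sampled uniformly at random,
$$A^x_{[\mathrm{eval}_y(\cdot)_i = a_i]} \otimes I_B \approx_\delta I_A \otimes G^{i,x}_{[\mathrm{eval}_y(\cdot) = a_i]} . \qquad (40)$$
Define the POVM family $\{C^x_{g_1,g_2,\ldots,g_k}\}$, for $x \in \mathcal{X}$, by
$$C^x_{g_1,g_2,\ldots,g_k} = G^{k,x}_{g_k} \cdots G^{2,x}_{g_2} \, G^{1,x}_{g_1} \, G^{2,x}_{g_2} \cdots G^{k,x}_{g_k} .$$
Then on average over $x \sim \mu$ and $y \in \mathcal{Y}$ sampled uniformly at random,
$$A^x_{[\mathrm{eval}_y(\cdot) = (a_1,a_2,\ldots,a_k)]} \otimes I_B \approx_{k(\delta+\varepsilon)^{1/2}} I_A \otimes C^x_{[\mathrm{eval}_y(\cdot) = (a_1,a_2,\ldots,a_k)]} . \qquad (41)$$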


Proof. The proof is identical to the one given in [NW19, Fact 4.34]; the only modification needed is to
insert the dependence on $x$ for all measurements considered. □


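Lemma 5.27 (Fact 4.35 in [NW19]). Let $D$ be a distribution on $(x, y_1, y_2) \in \mathcal{X} \times \mathcal{Y}_1 \times \mathcal{Y}_2$. For $i \in \{1,2\}$
let $\mathcal{G}_i$ be a collection of functions $g_i : \mathcal{Y}_i \to \mathcal{R}_i$ and let $\{(G_i)^x_g\}_{g\in\mathcal{G}_i}$ be families of measurements such that
$\{(G_2)^x_g\}_g$ is projective for every $x$. Suppose further that for every $(x,y_1)$ it holds that for $g_2 \ne g'_2 \in \mathcal{G}_2$
the probability, on average over $y_2$ chosen from $D$ conditioned on $(x, y_1)$, that $g_2(y_2) = g'_2(y_2)$ is at most
$\eta$. Let $\{A^{x,y_1,y_2}_{a_1,a_2}\}$ be a family of projective measurements with outcomes $(a_1,a_2) \in \mathcal{R}_1 \times \mathcal{R}_2$ such that for
$i \in \{1,2\}$,
$$A^{x,y_1,y_2}_{a_i} \otimes I \simeq_\delta I \otimes (G_i)^x_{[g_i(y_i) = a_i]} \qquad (42)$$
and
$$A^{x,y_1,y_2}_{a_1,a_2} \otimes I \simeq_\delta I \otimes A^{x,y_1,y_2}_{a_1,a_2} . \qquad (43)$$
Define a family of measurements $\{J^x_{g_1,g_2}\}$ as
$$J^x_{g_1,g_2} = (G_2)^x_{g_2} (G_1)^x_{g_1} (G_2)^x_{g_2} . \qquad (44)$$
Then there is a
$$\delta_{\mathrm{pasting}} = \delta_{\mathrm{pasting}}(\eta,\delta) = \mathrm{poly}(\eta,\delta) \qquad (45)$$
such that
$$A^{x,y_1,y_2}_{a_1,a_2} \otimes I \simeq_{\delta_{\mathrm{pasting}}} I \otimes J^x_{[g_1(y_1), g_2(y_2) = a_1, a_2]} . \qquad (46)$$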


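Lemma 5.28 (Fact 4.32 of [NW19]). Let $\mathfrak{G}$ be a nonlocal game, and let $\mathscr{S}, \mathscr{S}'$ be two strategies that are
$\delta$-close (in the sense of Definition 5.17) for $\delta \in [0,1]$, and use the same state $|\psi\rangle$. If either $\mathscr{S}$ or $\mathscr{S}'$ is
projective, then $|\mathrm{val}^*(\mathfrak{G}, \mathscr{S}) - \mathrm{val}^*(\mathfrak{G}, \mathscr{S}')| \le O(\delta^{1/2})$.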


5.3 The class MIP* 


The complexity class MIP* of multi-prover interactive proof systems with entangled provers is the class
of languages that can be decided by a proof system in which a polynomial-time classical verifier interacts
with noncommunicating, computationally unbounded provers who may share a finite-dimensional entangled
state. In general there may be polynomially many provers, and the verifier may interact with them over
polynomially many rounds. In each round, the verifier generates a question for each prover and receives a
response from each of them. At the end of the interaction the verifier decides to either accept or reject. A language
$L$ is in MIP* if for every input $z \in L$, there exists a strategy that the provers can use to convince the verifier
to accept with high probability (at least 2/3), and for every input $z \notin L$, no strategy allows the provers to
convince the verifier to accept with probability greater than 1/3. The former property is called
completeness of the proof system, and the latter soundness. The completeness and soundness probabilities
2/3 and 1/3 may be amplified by sequentially repeating the protocol.




For a formal definition of the class MIP*, see e.g. [VW16, Section 6.1]. The main result of this paper
is a lower bound, and for this purpose, it suffices to restrict our attention to proof systems which involve
only two provers, one round of interaction, and completeness probability 1. This class is often denoted
$\mathrm{MIP}^*_{1,1/2}(2,1)$. A formal definition for this restricted setting follows.


Definition 5.29. A language $L$ is in $\mathrm{MIP}^*_{1,1/2}(2,1)$ if and only if there exist two randomized Turing machines $S$
and $D$ with the following properties.

1. Efficiency: For every $z \in \{0,1\}^*$ there is a game $\mathfrak{G}_z = (\mathcal{X}, \mathcal{Y}, \mathcal{A}, \mathcal{B}, \mu, D)$ such that:

(a) The Turing machine $S$ given input $z$ runs in time $\mathrm{poly}(|z|)$ and returns a pair $(x,y) \in \mathcal{X} \times \mathcal{Y}$
such that the distribution of $(x,y)$, over the random choices of $S$, is $\mu$.

(b) The Turing machine $D$ given as input $z$ and a tuple $(x,y,a,b) \in \mathcal{X} \times \mathcal{Y} \times \mathcal{A} \times \mathcal{B}$ runs in time
$\mathrm{poly}(|z|)$ and returns $D(x,y,a,b)$.^{25}

2. Completeness: If $z \in L$, then $\mathrm{val}^*(\mathfrak{G}_z) = 1$.
3. Soundness: If $z \notin L$, then $\mathrm{val}^*(\mathfrak{G}_z) \le 1/2$.

We say that the pair $(S, D)$ forms an MIP* verifier for the language $L$, and the associated family of games
$\{\mathfrak{G}_z\}$ is an MIP* protocol for $L$.


It is clear that $\mathrm{MIP}^*_{1,1/2}(2,1) \subseteq \mathrm{MIP}^*$ (the probability 1/2 of acceptance of inputs not in the language
may be reduced to 1/3 by sequential repetition as noted above). The main result of this paper is that
$\mathrm{RE} \subseteq \mathrm{MIP}^*_{1,1/2}(2,1)$. Since it is known that $\mathrm{MIP}^* \subseteq \mathrm{RE}$ it follows that $\mathrm{MIP}^*_{1,1/2}(2,1) = \mathrm{MIP}^* = \mathrm{RE}$.
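
The amplification step mentioned in parentheses can be made concrete with a short calculation. This is only a sketch under an illustrative assumption: the verifier runs the one-round protocol $k$ times in sequence with fresh randomness and accepts only if every run accepts (the symbol $k$ and the all-accept rule are not notation from the text). Conditioned on any transcript of earlier runs, the provers' behavior in the next run is still a strategy for a single game, so the per-run bounds multiply:

```latex
\begin{align*}
z \in L    &:\quad \Pr[\text{all $k$ runs accept}] = 1^k = 1, \\
z \notin L &:\quad \Pr[\text{all $k$ runs accept}] \le \left(\tfrac{1}{2}\right)^{k},
              \qquad\text{so } k = 2 \text{ already gives } \tfrac{1}{4} \le \tfrac{1}{3}.
\end{align*}
```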

To show our lower bound it will suffice to consider an even more restricted class of MIP* verifiers, for 
which the Turing machines S and D have a special structure which we refer to as normal form. This is 
defined in the following subsection. 


5.4 Normal form verifiers 


We introduce a normal form for verifiers in nonlocal games. The normal form uses Turing machines to 
specify the two actions performed by the verifier in a game: the generation of questions and the verification 
of answers. For the generation of questions, we use the formalism of samplers introduced in Section 4.2. 
The normal form for verifiers gives a uniform method to specify an infinite family of nonlocal games. 


Definition 5.30 (Decider). A decider is a 5-input Turing machine $D$ that, on all inputs of the form $(n, x, y, a, b)$
where $n$ is an integer and $x,y,a,b \in \{0,1\}^*$, halts and returns a single bit. Let $\mathrm{TIME}_D(n)$ denote the
maximum time complexity of $D$ over all inputs of the form $(n, x, y, a, b)$. When the decider $D$ outputs 0 we
say that it rejects, otherwise we say that it accepts. Furthermore, we call the input $n$ to a decider the index.


Definition 5.31. A normal form verifier is a pair V = (S, D) where S is a sampler with field size q(n) = 2 
and D is a decider. The description length of V is defined to be |V| = max{|S|,|D|}, the maximum of 
the description lengths of S and D. The number of levels of verifier V is defined to be the number of levels 
of its sampler S. 


^{25} Note that the running time of $D$ should be $\mathrm{poly}(|z|)$, even for long inputs $a,b$. This can be ensured by having $D$ return 0
whenever $x, y, a, b$ are too long with respect to $|z|$.




Next we introduce a notion of bounded normal form verifiers, in which a single parameter (in this case,
an integer $\lambda$) specifies a bound on the time complexity of $V$, as well as on the description length of the
verifier.

Definition 5.32 ($\lambda$-bounded verifiers). Let $\lambda \in \mathbb{N}$ be an integer. A normal form verifier $V = (S,D)$ is
$\lambda$-bounded if the following two conditions hold:

1. The time complexity bounds of the verifier $\mathrm{TIME}_S(n)$, $\mathrm{TIME}_D(n)$ are at most $n^\lambda$ for $n \ge 2$.^{26}
2. The description length of the verifier $|V|$ is bounded by $\lambda$.


Normal form verifiers specify an infinite family of nonlocal games indexed by natural numbers in the 
following way. 


Definition 5.33. Let $V = (S, D)$ be a normal form verifier. For $n \in \mathbb{N}$, we define the following nonlocal
game $V_n$ to be the $n$-th game corresponding to the verifier $V$. The question sets $\mathcal{X}$ and $\mathcal{Y}$ are $\{0,1\}^{\mathrm{TIME}_S(n)}$.
The answer sets $\mathcal{A}$ and $\mathcal{B}$ are $\{0,1\}^{\mathrm{TIME}_D(n)}$. The question distribution is the distribution $\mu_{S,n}$ specified in
Definition 4.16. The decision predicate is the function computed by $D(n, \cdot,\cdot,\cdot,\cdot)$, when the last four inputs
are restricted to $\mathcal{X} \times \mathcal{Y} \times \mathcal{A} \times \mathcal{B}$. The value of the game is denoted by $\mathrm{val}^*(V_n)$.

We note that the game $V_n$ is well-defined since for a normal form verifier the distribution $\mu_{S,n}$ is supported
on $\{0,1\}^{\mathrm{TIME}_S(n)} \times \{0,1\}^{\mathrm{TIME}_S(n)}$ and a normal form decider always halts with a single-bit output.


Definition 5.34 (Verifier with commuting strategy). Let $V = (S, D)$ be a normal form verifier. For $v :
\mathbb{N} \to [0,1]$ say that $V$ has a value-$v$ commuting strategy if for all $n \in \mathbb{N}$, the game $V_n$ has a value-$v(n)$
commuting strategy.


^{26} We do not require the bound to hold for $n = 1$ as $n^\lambda$ is always 1 and hence it is usually not satisfied.




6 Types 


We augment the definition of conditionally linear functions with a construct we call types. A type $t$ is an
element of a type set $\mathcal{T}$, and a $\mathcal{T}$-typed family of conditionally linear functions is a collection $\{L_t\}_{t\in\mathcal{T}}$
containing a CL function $L_t$ for each type $t \in \mathcal{T}$. The utility of this definition is that it allows us to
define another object, namely conditionally linear distributions parameterized by an undirected graph $G =
(\mathcal{T}, E)$ on the set of types known as a type graph. Given two $\mathcal{T}$-typed families of conditionally linear
functions $\{L_u\}_{u\in\mathcal{T}}, \{R_v\}_{v\in\mathcal{T}}$, the $(\mathcal{T}, G)$-typed conditionally linear distribution corresponding to them is
the distribution which samples a pair of types $(u,v)$ uniformly at random from the edges of $G$ (with each
endpoint having equal probability of being chosen for $u$ or $v$, respectively) and then samples $(x,y)$ from
$\mu_{L_u, R_v}$. The output is the pair $((u,x),(v,y))$.

The normal form verifiers we present in the paper frequently use typed CL distributions to sample 
their questions, rather than untyped CL distributions. Types allow us to model the parts of their question 
distributions which are unstructured and unsuitable for being sampled from CL distributions. A common 
use of types is to allow the verifier to use previously defined games as subroutines. Here, the type helps 
indicate which subroutine the verifier selects, and an edge in the type graph between two different types 
allows us to introduce a test that cross-checks the results of one subroutine with the results of another. 

Finally, we show how to convert any typed CL distribution into an equivalent (in the precise sense
defined below) untyped CL distribution with two additional levels, a technique we call detyping. This
entails showing how to “simulate” the graph distribution of $G = (\mathcal{T}, E)$, i.e. the uniform distribution on its
edges, using an untyped CL distribution. The simulation we give is based on rejection sampling and is only
approximate: its quality degrades exponentially with the number of types in $\mathcal{T}$. As a result, we will ensure
throughout the paper that all type sets we consider are of a small, in fact generally constant, size.

This section is organized as follows. In Section 6.1 we define typed variants of CL distributions, samplers,
deciders, and verifiers. In Section 6.2 we define a CL distribution which samples from the graph
distribution of a given graph $G = (\mathcal{T}, E)$. In Section 6.3 we define a canonical way to detype typed samplers,
deciders, and verifiers using the graph sampler from Section 6.2. We then prove the main result of the
section, Lemma 6.18, which relates the value of the detyped normal form verifier to the value of the original
typed verifier.


6.1 Typed samplers, deciders, and verifiers 


Definition 6.1 (Typed conditionally linear functions). Let $\mathcal{T}$ be a finite set and $V$ be $\mathbb{F}^n$ for some integer
$n > 0$. A $\mathcal{T}$-typed family of $\ell$-level conditionally linear functions (implicitly, on $V$) is a collection $\{L_t\}_{t\in\mathcal{T}}$
such that, for each $t \in \mathcal{T}$, $L_t$ is an $\ell$-level conditionally linear function on $V$.


Definition 6.2 (Graph distribution). Let $G = (U,E)$ be an undirected graph with vertex set $U$ and edge
set $E$. Edges in $E$ are written as multisets $\{u,v\}$ of two vertices; the case $u = v$ represents a self-loop.
Suppose there are $m$ edges, $k$ of which are self-loops. Then the graph distribution $\mu_G$ of $G$ is the distribution
over $U \times U$ such that for every $(u,v) \in U \times U$,
$$\mu_G(u,v) = \begin{cases} 1/(2m-k) & \text{if } \{u,v\} \in E, \\ 0 & \text{otherwise.} \end{cases}$$
This is identical to the uniform distribution over pairs $(u,v) \in U \times U$ such that $\{u,v\} \in E$.
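
As a quick illustration of the normalization $2m - k$, the following Python sketch builds the graph distribution for a small graph. The function name `graph_distribution` and the representation of edges as 2-tuples (with `(u, u)` standing for a self-loop) are assumptions made for this example only.

```python
from fractions import Fraction


def graph_distribution(edges):
    """Return mu_G as a dict mapping ordered pairs (u, v) to probabilities.

    `edges` lists each undirected edge once as a 2-tuple; (u, u) is a self-loop.
    """
    support = set()
    for u, v in edges:
        support.add((u, v))
        support.add((v, u))  # for a self-loop this adds nothing new
    # The support has 2m - k ordered pairs: two per ordinary edge, one per loop.
    weight = Fraction(1, len(support))
    return {pair: weight for pair in support}


# Hypothetical example: a path a-b-c with a self-loop at c (m = 3, k = 1).
mu = graph_distribution([("a", "b"), ("b", "c"), ("c", "c")])
```

Here every one of the five ordered pairs in the support receives probability $1/5$, matching $1/(2m-k)$ with $m = 3$ and $k = 1$.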




Definition 6.3 (Typed conditionally linear distributions). Let $\mathcal{T}$ be a type set and $L = \{L_u\}_{u\in\mathcal{T}}$, $R =
\{R_v\}_{v\in\mathcal{T}}$ be $\mathcal{T}$-typed families of conditionally linear functions on $V$. Let $G = (\mathcal{T}, E)$ be a graph with vertex
set $\mathcal{T}$. The $(\mathcal{T}, G)$-typed conditionally linear distribution $\mu_{L,R,G}$ corresponding to $(L, R)$ is the distribution
over pairs $((u,x),(v,y))$, where $(u,v)$ is drawn from $\mu_G$ and $(x,y)$ is drawn from $\mu_{L_u,R_v}$.


Definition 6.4 (Typed conditionally linear samplers). Let $q : \mathbb{N} \to \mathbb{N}$ be an admissible field size function
and $s : \mathbb{N} \to \mathbb{N}$ be a function. Let $\mathcal{T}$ be a finite type set and let $G = (\mathcal{T}, E)$ be a graph with vertex
set $\mathcal{T}$. A 7-input Turing machine $S$ is a $(\mathcal{T}, G)$-typed, $\ell$-level conditionally linear sampler with field size
$q(n)$ and dimension $s(n)$ if for all $n \in \mathbb{N}$, letting $q = q(n)$ and $s = s(n)$, there exist $\mathcal{T}$-typed families of
$\ell$-level conditionally linear functions $\{L^{A,n}_t\}_{t\in\mathcal{T}}$ and $\{L^{B,n}_t\}_{t\in\mathcal{T}}$ on $V = \mathbb{F}_q^s$ where, for $t \in \mathcal{T}$ and $w \in \{A,B\}$,
the conditionally linear function $L^{w,n}_t$ has marginal functions $\{L^{w,n}_{t,\le j}\}$ and factor spaces $\{V^{w,n}_{t,j,u}\}$ satisfying
the conditions of Lemma 4.6, and for all $t \in \mathcal{T}$, $w \in \{A,B\}$, $j \in \{1,\ldots,\ell\}$, and $z \in V$:

• On input $(n, \mathrm{DIMENSION})$, the sampler $S$ returns the dimension $s(n)$.

• On input $(n, w, \mathrm{MARGINAL}, j, z, t)$, the sampler $S$ returns the binary representation of $L^{w,n}_{t,\le j}(z)$.

• On input $(n, w, \mathrm{LINEAR}, j, u, y, t)$, the sampler $S$ outputs the binary representation of $L^{w,n}_{t,j,u}(y)$.

• On input $(n, w, \mathrm{FACTOR}, j, u, t)$, the sampler $S$ returns the factor space $V^{w,n}_{t,j,u}$ of $L^{w,n}_t$, represented as
an indicator vector in $\{0,1\}^s$.

We call $V$ the ambient space of $S$. We call $\{L^{A,n}_t\}$, $\{L^{B,n}_t\}$ the CL functions of $S$ on index $n$. The time
complexity of $S$, denoted $\mathrm{TIME}_S(n)$, is the number of steps before $S$ halts for index $n$.


We assume that types $t \in \mathcal{T}$ are represented using binary strings of length at most $\lceil \log |\mathcal{T}| \rceil$; if a type
$t$ is given as input to the sampler $S$ and is not an element of $\mathcal{T}$, then the sampler returns 0. Furthermore,
as described in Remark 4.15 for un-typed samplers, we write typed samplers with different numbers of
arguments depending on the input. We note that Definition 6.4 includes the graph $G$ as a parameter, even
though it is not explicitly used anywhere; this is so we can define the following concept, i.e. the distribution
of a typed sampler, which does depend on $G$.


Definition 6.5 (Distribution of a typed sampler). Let $S$ be a $(\mathcal{T}, G)$-typed sampler. Let $L^{w,n} = \{L^{w,n}_t\}$ for
$w \in \{A,B\}$ be the CL functions of $S$ on index $n$. The distribution of sampler $S$ on index $n$, denoted $\mu_{S,n}$,
is the $(\mathcal{T}, G)$-typed conditionally linear distribution corresponding to $(L^{A,n}, L^{B,n})$.


Definition 6.6 (Downsizing typed CL samplers). Let $S$ be a typed sampler. The typed downsized sampler
$\kappa(S)$ is defined as in Definition 4.17 with the only difference that the type $t$ is included as part of the input
to the sampler, as in Definition 6.4.


Lemma 6.7. Let $S$ be a $(\mathcal{T}, G)$-typed $\ell$-level CL sampler, for some finite set $\mathcal{T}$, graph $G$, and integer $\ell > 0$.
Let $q(n)$ and $s(n)$ be as in Definition 6.4. Then $\kappa(S)$ defined in Definition 6.6 is a $(\mathcal{T}, G)$-typed $\ell$-level CL
sampler with field size 2, dimension $s(n) \log q(n)$, and time complexity
$$\mathrm{TIME}_{\kappa(S)}(n) = O(\mathrm{TIME}_S(n) \log q(n)) .$$
Furthermore, for every integer $n \ge 1$, the CL functions of $\kappa(S)$ on index $n$ are $\{\kappa(L^{w,n}_t)\}_{w\in\{A,B\}, t\in\mathcal{T}}$ as
defined in Definition 4.10.




Proof. The proof is analogous to the proof of Lemma 4.18, and we omit it. □


Definition 6.8 (Typed decider). A typed decider is a 7-input Turing machine $D$ that, on all inputs of the form
$(n,u,x,v,y,a,b)$ where $n$ is an integer and $u,x,v,y,a,b \in \{0,1\}^*$, halts and returns a single bit. When
$D$ returns 0 we say that it rejects, otherwise we say that it accepts. We use $\mathrm{TIME}_D(n)$ to denote the time
complexity of $D$ on inputs of the form $(n,\ldots)$.


Definition 6.9. Let $\mathcal{T}$ be a finite set and let $G = (\mathcal{T}, E)$ be a graph. A $(\mathcal{T}, G)$-typed normal form verifier
is a pair $V = (S,D)$ where $S$ is a $(\mathcal{T}, G)$-typed sampler with field size $q(n) = 2$ and $D$ is a typed decider.

Definition 6.10. Let $V = (S,D)$ be a $(\mathcal{T}, G)$-typed normal form verifier. For $n \in \mathbb{N}$, we define the
following nonlocal game $V_n$ to be the $n$-th game corresponding to the verifier $V$. The question sets $\mathcal{X}$
and $\mathcal{Y}$ are $\mathcal{T} \times \{0,1\}^{\mathrm{TIME}_S(n)}$. The answer sets $\mathcal{A}$ and $\mathcal{B}$ are $\{0,1\}^{\mathrm{TIME}_D(n)}$. The question distribution
is the distribution $\mu_{S,n}$ specified in Definition 6.5. The decision predicate is the function computed by
$D(n,u,x,v,y,a,b)$, when the inputs $((u,x),(v,y),a,b)$ are restricted to $\mathcal{X} \times \mathcal{Y} \times \mathcal{A} \times \mathcal{B}$. The value of
the game is denoted by $\mathrm{val}^*(V_n)$.


For w € {A,B} and a question (u, x) to player w we refer to u as the question type and x as the question 
content. 


6.2 Graph distributions 


We describe a construction of conditionally linear distributions which sample from the graph distribution 
(see Definition 6.2) of a graph G = (U, E). We begin with a technical definition, followed by the definition 
of the conditionally linear distribution. 


Definition 6.11 (Neighbor indicator). Given a graph $G = (U, E)$, the neighbor indicator of a vertex $u \in U$
is the vector $\mathrm{neigh}_G(u) \in \mathbb{F}_2^U$ in which, for all $v \in U$,
$$\mathrm{neigh}_G(u)_v = \begin{cases} 1 & \text{if } \{u,v\} \in E, \\ 0 & \text{otherwise.} \end{cases}$$
In addition, the $\mathbb{F}_2$-encoding of a vertex $u \in U$ is the vector $\mathrm{enc}_G(u) \in \mathbb{F}_2^U \times \mathbb{F}_2^U$ given by $\mathrm{enc}_G(u) =
(e_u, \mathrm{neigh}_G(u))$, where $e_u$ is the standard basis vector with a 1 in the $u$-th position and 0’s everywhere else.


Definition 6.12 (Graph sampler). Let $G = (U, E)$ be a graph with $n$ vertices. Then the conditionally linear
functions corresponding to $G$ are the pair of functions $L^A_G, L^B_G$ on linear space $V_G$ specified in Fig. 2 where
$V_G = V_{V_A} \oplus V_{N_A} \oplus V_{V_B} \oplus V_{N_B}$.


These conditionally linear functions do not simulate the graph distribution in the sense of sampling 
directly from it. The following proposition, however, does show a sense in which these functions simulate 
the graph distribution, namely via rejection sampling. 


Proposition 6.13 (Simulating the graph distribution). Let $G = (U,E)$ be a graph with $n$ vertices and $m$
edges, $k$ of which are self-loops. Let $L^A_G, L^B_G$ be the conditionally linear functions corresponding to $G$ (see
Figure 2 for the definition and associated notation). Let $(x,y) \sim \mu_{L^A_G, L^B_G}$. Consider the event $E_G$ that there
exist $u,v \in U$ such that the following two statements are true.

(i) $x^{V_{V_A} \oplus V_{N_A}} = \mathrm{enc}_G(u)$ and $y^{V_{V_B} \oplus V_{N_B}} = \mathrm{enc}_G(v)$,




Subspaces: $V_{V_A}$, $V_{N_A}$, $V_{V_B}$, $V_{N_B}$, each equal to $\mathbb{F}_2^U$.

Conditionally linear function $L^A_G$
  1st factor subspace:   $V_{V_A} \oplus V_{N_A}$
  1st linear function:   Identity function
  2nd factor subspace:   $V_{V_B} \oplus V_{N_B}$
  2nd linear functions:  For all $x \in V_{V_A} \oplus V_{N_A}$, suppose there exists a $u \in U$ such that
                         $x = \mathrm{enc}_G(u)$. Then for all $y \in V_{V_B} \oplus V_{N_B}$, $L^A_{G,2,x}$ zeroes out all
                         entries of $y$ except for $(y^{V_{N_B}})_u$. Otherwise, $L^A_{G,2,x} = 0$.

Conditionally linear function $L^B_G$
  1st factor subspace:   $V_{V_B} \oplus V_{N_B}$
  1st linear function:   Identity function
  2nd factor subspace:   $V_{V_A} \oplus V_{N_A}$
  2nd linear functions:  Similarly defined as those for $L^A_G$ by swapping $V_{V_A}$ and $V_{N_A}$ with
                         $V_{V_B}$ and $V_{N_B}$ respectively.

Figure 2: Specification of the conditionally linear functions corresponding to $G$.


(ii) $(x^{V_{N_B}})_u = (y^{V_{N_A}})_v = 1$.

Then
1. $\Pr_{x,y}(E_G) = (2m - k)/16^n$.
2. Conditioned on $E_G$, $(u,v)$ are distributed as the graph distribution of $G$ (see Definition 6.2).

Note that $E_G$ occurs if and only if both $x^{V_{N_B}}$ and $y^{V_{N_A}}$ are nonzero. In particular, if $x$ and $y$ are sampled
from $\mu_{L^A_G, L^B_G}$ and given to the respective players, then at least one of them knows when the event $E_G$ does not
occur.


Proof. Let $z$ be drawn uniformly at random from $V_{V_A} \oplus V_{N_A} \oplus V_{V_B} \oplus V_{N_B}$, and let $x = L^A_G(z)$ and
$y = L^B_G(z)$. Then with probability $n^2/16^n$, there exist $u,v \in U$ such that
$$x^{V_{V_A} \oplus V_{N_A}} = \mathrm{enc}_G(u) \quad\text{and}\quad y^{V_{V_B} \oplus V_{N_B}} = \mathrm{enc}_G(v) .$$
Conditioned on this occurring, $u$ and $v$ are distributed as independent, uniformly random vertices in $U$. If
we further condition on $\{u,v\} \in E$, which occurs with probability $(2m - k)/n^2$, then by definition, $(u,v)$
is distributed as the graph distribution of $G$. But this event is exactly the event that $E_G$ holds on $(x,y)$,
establishing the proposition. □
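
The counting in this proof can be checked by brute force on a small graph. The Python sketch below enumerates every seed $z \in \mathbb{F}_2^{4n}$, decodes the two encodings, and tests the edge condition through the single surviving neighbor-indicator entry on each side. It is an illustration only: vertices are taken to be `0..n-1`, edges are `frozenset`s, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def unit(n, i):
    """Standard basis vector e_i in F_2^n."""
    return tuple(1 if j == i else 0 for j in range(n))


def neigh(edges, n, u):
    """Neighbor indicator of u; a self-loop {u, u} is stored as frozenset({u})."""
    return tuple(1 if frozenset((u, v)) in edges else 0 for v in range(n))


def enc(edges, n, u):
    return unit(n, u) + neigh(edges, n, u)


def decode(edges, n, block):
    """Return the vertex whose encoding equals `block`, or None."""
    for u in range(n):
        if block == enc(edges, n, u):
            return u
    return None


def event_distribution(edges, n):
    """Probability of each (u, v) witnessing E_G over a uniform seed z."""
    total = 2 ** (4 * n)
    hits = {}
    for z in product((0, 1), repeat=4 * n):
        a_block, b_block = z[:2 * n], z[2 * n:]  # (V_VA, V_NA) and (V_VB, V_NB)
        u, v = decode(edges, n, a_block), decode(edges, n, b_block)
        if u is None or v is None:
            continue
        # Player A keeps entry u of V_NB; player B keeps entry v of V_NA.
        if b_block[n + u] == 1 and a_block[n + v] == 1:
            hits[(u, v)] = hits.get((u, v), 0) + 1
    return {pair: Fraction(count, total) for pair, count in hits.items()}


# Hypothetical example: path 0-1-2 with a self-loop at 2 (n = 3, m = 3, k = 1).
edges = {frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 2))}
dist = event_distribution(edges, 3)
```

For this graph the event has total probability $5/16^3$, and the five ordered pairs in the support are equally likely, in line with both items of the proposition.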




6.3 Detyping typed verifiers 


We give a canonical method for taking a typed normal form verifier and producing an untyped normal
form verifier which simulates it. Throughout this section, $\mathcal{T}$ denotes a finite set, $G = (\mathcal{T}, E)$ denotes a
graph, and $L^A_G, L^B_G$ denote the conditionally linear functions corresponding to $G$ acting on the vector space
$V_G = V_{V_A} \oplus V_{N_A} \oplus V_{V_B} \oplus V_{N_B}$ of dimension $4 \cdot |\mathcal{T}|$ over $\mathbb{F}_2$, as in Definition 6.12.


Definition 6.14 (Detyped CL functions). Let $L^A = \{L^A_t\}$, $L^B = \{L^B_t\}$ be $\mathcal{T}$-typed families of $\ell$-level
conditionally linear functions on $V$. We define the detyped CL functions corresponding to $(L^A, L^B)$ on $G$ to
be the pair of $(\ell + 2)$-level CL functions $(R^A, R^B) = \mathrm{Detype}_G(L^A, L^B)$ on linear space $V_{\mathrm{DETYPE}} = V_G \oplus V$
as follows. For $w \in \{A,B\}$, write $\overline{w}$ for the other element of $\{A,B\}$. For $z \in L^w_G(V_G)$, define the family of $\ell$-level CL functions $\{L^w_z\}$ on $V$ as
$$L^w_z = \begin{cases} 0 & \text{if } z^{V_{N_{\overline{w}}}} = 0, \\ L^w_t & \text{otherwise, for } z^{V_{V_w}} = e_t . \end{cases}$$
We note that when $z^{V_{N_{\overline{w}}}}$ is nonzero, it is always the case that $z^{V_{V_w}} = e_t$ for a unique type $t$, by Definition 6.12.
For $w \in \{A,B\}$, $R^w$ is the concatenation of $L^w_G$ and $\{L^w_z\}_z$ (cf. Lemma 4.7).


Definition 6.15 (Detyped samplers). Let $S$ be a $(\mathcal{T}, G)$-typed sampler. For each $n \in \mathbb{N}$, let $\{L^{A,n}_t\}$, $\{L^{B,n}_t\}$
be the CL functions of $S$ on index $n$, and set $(R^{A,n}, R^{B,n}) = \mathrm{Detype}_G(L^{A,n}, L^{B,n})$. Then the detyped
sampler $\mathrm{Detype}(S)$ is the (standard) sampler whose CL functions on index $n$ are $R^{A,n}, R^{B,n}$. Its dimension
function is $s_{\mathrm{DETYPE}}(n) = 4|\mathcal{T}| + s(n)$.


We note that Detype(S) is indeed a sampler in line with Definition 4.14, i.e. a sampler without types. 


Definition 6.16 (Detyped deciders). Let $D$ be a typed decider. We define the detyped decider $\mathrm{Detype}_G(D)$
to be the (standard) decider that behaves as follows: on input $(n, x, y, a, b)$, it attempts to parse $x = (x', x'')$,
$y = (y', y'') \in V_G \times \{0,1\}^*$ (using a canonical scheme for representing pairs of strings). If it cannot, it
accepts. Otherwise, suppose that there exists $\{u,v\} \in E$ such that, using notation from Definition 6.11,
$$x' = (e_u, \mathrm{neigh}_G(u), 0, e_u), \qquad y' = (0, e_v, e_v, \mathrm{neigh}_G(v)) \;\in\; V_{V_A} \oplus V_{N_A} \oplus V_{V_B} \oplus V_{N_B} .$$
Then it returns the output of $D$ on input $(n, u, x'', v, y'', a, b)$. Otherwise, it accepts.
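
The control flow of the detyped decider can be sketched in Python as follows. This is a simplified illustration rather than the Turing-machine construction itself: questions are modeled as Python pairs `(head, tail)` instead of encoded strings, types are the integers `0..n_types-1`, edges are `frozenset`s, and the names `detype_decider`, `view_a`, `view_b` are hypothetical.

```python
def unit(n, i):
    return tuple(1 if j == i else 0 for j in range(n))


def neigh(edges, n, u):
    return tuple(1 if frozenset((u, v)) in edges else 0 for v in range(n))


def view_a(edges, n, u):
    """(e_u, neigh(u), 0, e_u) in V_VA + V_NA + V_VB + V_NB."""
    return unit(n, u) + neigh(edges, n, u) + (0,) * n + unit(n, u)


def view_b(edges, n, v):
    """(0, e_v, e_v, neigh(v)) in V_VA + V_NA + V_VB + V_NB."""
    return (0,) * n + unit(n, v) + unit(n, v) + neigh(edges, n, v)


def detype_decider(typed_decider, edges, n_types):
    """Wrap a typed decider D(n, u, x, v, y, a, b) into an untyped one."""
    def decider(index, x, y, a, b):
        for q in (x, y):
            if not (isinstance(q, tuple) and len(q) == 2):
                return 1  # questions that cannot be parsed are accepted
        (x_head, x_tail), (y_head, y_tail) = x, y
        for u in range(n_types):
            for v in range(n_types):
                if frozenset((u, v)) not in edges:
                    continue
                if x_head == view_a(edges, n_types, u) and y_head == view_b(edges, n_types, v):
                    return typed_decider(index, u, x_tail, v, y_tail, a, b)
        return 1  # no edge matches the type parts: accept automatically
    return decider


# Hypothetical typed decider: accept exactly when the two answers agree.
edges = {frozenset((0, 1))}
D = detype_decider(lambda n, u, x, v, y, a, b: 1 if a == b else 0, edges, 2)
good_x = (view_a(edges, 2, 0), "q")
good_y = (view_b(edges, 2, 1), "r")
```

With these inputs, `D(1, good_x, good_y, "0", "1")` defers to the typed decider and rejects, while any question pair whose type parts do not match an edge is accepted regardless of the answers.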


We note that we include the subscript “G” in Detypeg(D) because we do not specify a graph when 
defining a typed decider (cf. Definition 6.8). This is different than the situation for samplers S, where we 
do not include a subscript in Detype(S) because the graph G is already specified when defining S. 


Definition 6.17 (Detyped verifiers). Let $V = (S, D)$ be a $(\mathcal{T}, G)$-typed normal form verifier. We define the
detyped verifier, denoted by $\mathrm{Detype}(V)$, to be the (standard) normal form verifier $(\mathrm{Detype}(S), \mathrm{Detype}_G(D))$.


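Lemma 6.18 (Typed verifiers to detyped verifiers). Let $V = (S, D)$ be a $(\mathcal{T}, G)$-typed normal form verifier.
The detyped verifier $\mathrm{Detype}(V) = (\mathrm{Detype}(S), \mathrm{Detype}_G(D))$ satisfies the following properties: for all
$n \in \mathbb{N}$,

1. (Completeness) If $V_n$ has a value-1 PCC strategy, then $\mathrm{Detype}(V)_n$ has a value-1 PCC strategy.

2. (Soundness) If $\mathrm{val}^*(\mathrm{Detype}(V)_n) > 1 - \varepsilon$, then $\mathrm{val}^*(V_n) \ge 1 - 16^{|\mathcal{T}|} \cdot \varepsilon$. Furthermore,
$$\mathscr{E}(\mathrm{Detype}(V)_n, 1 - \varepsilon) \ge \mathscr{E}(V_n, 1 - 16^{|\mathcal{T}|} \cdot \varepsilon) .$$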




3. (Sampler parameters) If $S$ is an $\ell$-level sampler, then $\mathrm{Detype}(S)$ is an $(\ell + 2)$-level sampler. The
time complexity of $\mathrm{Detype}(S)$ satisfies the following:
$$\mathrm{TIME}_{\mathrm{Detype}(S)}(n) = \mathrm{poly}(|\mathcal{T}|, \mathrm{TIME}_S(n)) .$$

4. (Decider complexity) The decider $\mathrm{Detype}_G(D)$ has time complexity $\mathrm{poly}(|\mathcal{T}|, \mathrm{TIME}_D(n))$.

5. (Efficient computability) The descriptions of $\mathrm{Detype}(S)$ and $\mathrm{Detype}_G(D)$ are polynomial-time computable
from the description of $G$ and the descriptions of $S$ and $D$, respectively.


Proof. Throughout this proof, we fix an index $n$. Let $s = s(n)$ be the dimension of $S$. The ambient space
of $S$ is $V = \mathbb{F}_2^s$ and the ambient space of $\mathrm{Detype}(S)$ is $V_{\mathrm{DETYPE}} = V_G \oplus V$. Let $u,v \in \mathcal{T}$. For this proof,
we introduce the notation
$$\mathrm{view}^A(u) = (e_u, \mathrm{neigh}_G(u), 0, e_u), \qquad \mathrm{view}^B(v) = (0, e_v, e_v, \mathrm{neigh}_G(v)) \;\in\; V_{V_A} \oplus V_{N_A} \oplus V_{V_B} \oplus V_{N_B} .$$
Supposing that players A and B receive $x$ and $y$ in $V_{\mathrm{DETYPE}}$, and supposing that $(x,y)$ satisfies event $E_G$
from Proposition 6.13, then $x^{V_G} = \mathrm{view}^A(u)$ and $y^{V_G} = \mathrm{view}^B(v)$ for a unique $\{u,v\} \in E$.


Completeness. Let $\mathscr{S} = (|\psi\rangle, A, B)$ be a value-1 PCC strategy for $V_n$. We construct a PCC strategy
$\mathscr{S}^{\mathrm{DETYPE}}$ for $\mathrm{Detype}(V)_n$ with value 1. This strategy also uses the state $|\psi\rangle$. When a player receives a
question, they perform measurements described as follows.

Player A: given $x \in V_{\mathrm{DETYPE}}$ the player checks if for some $u \in \mathcal{T}$, $x^{V_G} = \mathrm{view}^A(u)$. If so, they perform
the measurement
$$\big\{ A^{(u, x^{V})}_a \big\}_a$$
to obtain an outcome $a$, which they use as their answer. If not, they reply with the empty string. (This
entails performing the measurement whose POVM element corresponding to the empty string is the
identity matrix.)

Player B: given $y \in V_{\mathrm{DETYPE}}$, the player checks if for some $v \in \mathcal{T}$, $y^{V_G} = \mathrm{view}^B(v)$. If so, they perform
the measurement
$$\big\{ B^{(v, y^{V})}_b \big\}_b$$
to obtain an outcome $b$, which they use as their answer. If not, they reply with the empty string.

This strategy is projective and consistent because the only measurements it uses are those in $\mathscr{S}$ and “trivial”
measurements containing the identity matrix. Suppose the players receive questions $x$ and $y$ such that both
$x^{V_G} = \mathrm{view}^A(u)$ and $y^{V_G} = \mathrm{view}^B(v)$. In this case, the questions $(u, x^{V})$ and $(v, y^{V})$ are in the support
of the question distribution of $V_n$. As a result, the players succeed with probability 1 on these questions,
and their measurements always commute. For the remaining pairs of questions, the decider $\mathrm{Detype}_G(D)$
always accepts, and the measurements always commute by virtue of the fact that at least one is trivial, i.e.
containing the identity matrix as a POVM element.




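Soundness. Let $\mathscr{S} = (|\psi\rangle, A, B)$ be a strategy for $\mathrm{Detype}(V)_n$ with value $1 - \varepsilon$. Suppose $G$ has $m$ edges,
$k$ of which are self-loops. For any $(x,y)$ drawn from $\mu_{\mathrm{Detype}(S),n}$, the decider $\mathrm{Detype}_G(D)$ automatically
accepts unless $(x^{V_G}, y^{V_G})$ satisfies event $E_G$ from Proposition 6.13, which occurs with probability $(2m -
k)/16^{|\mathcal{T}|}$. When this happens, $x^{V_G}$ and $y^{V_G}$ are distributed as $\mathrm{view}^A(u)$ and $\mathrm{view}^B(v)$, where $(u,v)$ are
distributed as the graph distribution on $G$. As a result, conditioned on $E_G$, the probability that $\mathscr{S}$ succeeds
on $\mathrm{Detype}(V)_n$ is equal to the probability that the strategy $\mathscr{S}' = (|\psi\rangle, A', B')$ succeeds on $V_n$, where
$$(A')^{u,x}_a = A^{\mathrm{view}^A(u),\, x}_a , \qquad (B')^{v,y}_b = B^{\mathrm{view}^B(v),\, y}_b .$$
This means that
$$\mathrm{val}^*(\mathrm{Detype}(V)_n, \mathscr{S}) = \Big(1 - \frac{2m-k}{16^{|\mathcal{T}|}}\Big) + \frac{2m-k}{16^{|\mathcal{T}|}} \cdot \mathrm{val}^*(V_n, \mathscr{S}')$$
$$\le \Big(1 - \frac{1}{16^{|\mathcal{T}|}}\Big) + \frac{1}{16^{|\mathcal{T}|}} \cdot \mathrm{val}^*(V_n, \mathscr{S}') .$$
Thus, $\mathscr{S}'$ has value at least $1 - 16^{|\mathcal{T}|} \cdot \varepsilon$. This proves the first statement in the soundness. As for the second,
$\mathscr{S}$ and $\mathscr{S}'$ use the same state $|\psi\rangle$, and therefore both strategies have the same Schmidt rank, which by
definition is at least $\mathscr{E}(V_n, 1 - 16^{|\mathcal{T}|} \cdot \varepsilon)$.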
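
For readers who want the algebra behind the step "Thus" spelled out, here is a short worked version. It is a sketch that introduces the shorthand $p$ (not notation from the text) for the probability of the event from Proposition 6.13, and it assumes $G$ has at least one edge so that $p \ge 16^{-|\mathcal{T}|}$.

```latex
\begin{align*}
p &:= \frac{2m-k}{16^{|\mathcal{T}|}} \;\ge\; \frac{1}{16^{|\mathcal{T}|}}, \\
1 - \varepsilon &= \mathrm{val}^*(\mathrm{Detype}(V)_n, \mathscr{S})
   = 1 - p\,\bigl(1 - \mathrm{val}^*(V_n, \mathscr{S}')\bigr), \\
1 - \mathrm{val}^*(V_n, \mathscr{S}') &= \frac{\varepsilon}{p} \;\le\; 16^{|\mathcal{T}|}\,\varepsilon .
\end{align*}
```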


Complexity. Definition 6.14 implies that $\mathrm{Detype}(S)$ is an $(\ell + 2)$-level sampler by Lemma 4.7 and that
$V_{\mathrm{DETYPE}}$ has dimension $4|\mathcal{T}| + s$. The claimed time bounds of $\mathrm{Detype}(S)$ and $\mathrm{Detype}_G(D)$ follow from the
fact that these perform simple, $\mathrm{poly}(|\mathcal{T}|)$-time computations followed by running $S$ and $D$ as subroutines.

□




7 Classical and Quantum Low-degree Tests 


In this section we introduce the classical and quantum low individual degree tests, which form the basis of
our classical, quantum-sound and quantum PCP constructions. For quantum soundness of the classical test
we refer to [JNV+20]. For the quantum test, we refer to [NV18a] and its adaptation to the low individual
degree test given in Appendix A. The protocol in this work combines both of these tests, using the quantum
low individual degree test for question reduction and the classical low individual degree test for answer
reduction. Section 7.1 below introduces the classical test, and Section 7.3 does the same for the quantum
test. Prior to doing this, we introduce the Magic Square game in Section 7.2, a key subroutine in the quantum
test.


7.1 The classical low-degree test 


We begin with a generalization of the classical low-degree test known as the “simultaneous individual low- 
degree test”. We sometimes refer to this as the “classical low-degree test” for short. The low-degree test is 
used as a subroutine in the Pauli Basis test (see Section 7.3) as well as the answer-reduction normal form 
verifier (see Section 10). We describe the test as a nonlocal game in Section 7.1.1 and show that the question 
distribution can be implemented as a (typed) conditionally linear distribution (see Section 6 for the definition 
of typed CL distributions). 


7.1.1 The game 


The game $\mathfrak{G}^{\mathrm{LD}}$ is parametrized by a tuple $\mathsf{ldparams} = (q,m,d,k)$ where $m,d,k \in \mathbb{N}$ are integers, $q \in \mathbb{N}$
is an admissible field size, and $m$ divides $q$. We sometimes write $\mathfrak{G}^{\mathrm{LD}}_{\mathsf{ldparams}}$ to emphasize the dependence of
the classical low-degree test on the parameter tuple $\mathsf{ldparams}$.

We first provide a high-level description of the game for $k = 1$. It is based on the low-individual
degree variant of the multilinearity test of [BFL91], where one player (called the “points player”) receives a
random point $u \in \mathbb{F}_q^m$, and the other player (called the “lines player”) receives a random axis-parallel line
$\ell$ that contains $u$, i.e. the line parallel to the $i$-th axis which goes through point $u$, for uniformly random
$i \in \{1,\ldots,m\}$. The points player is supposed to respond with a value $a \in \mathbb{F}_q$, and the lines player is
supposed to respond with a univariate polynomial $f_\ell$ of degree at most $d$ such that $f_\ell(u_i) = a$, where $u_i$ is
the $i$-th coordinate of $u$.

Suppose the players agree beforehand on a global polynomial $g : \mathbb{F}_q^m \to \mathbb{F}_q$ such that every variable has degree at most $d$. Then a winning strategy is the following: the points player responds with $g(u)$, and the lines player responds with the univariate polynomial $f_\ell$ that is the restriction of $g$ to line $\ell$; formally, for $t \in \mathbb{F}_q$,

$$f_\ell(t) = g(u_1, u_2, \ldots, u_{i-1}, t, u_{i+1}, \ldots, u_m).$$

It is easy to see that $f_\ell(t)$ has degree at most $d$ in the variable $t$. Thus in this case the players will pass the low-degree test with probability 1. The low-degree testing theorem states that the converse approximately holds: if the players pass the low-degree test with probability close to 1, then their responses must be approximately consistent (in some sense) with a global polynomial $g$ with individual degree $d$.
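
The honest strategy just described can be simulated classically. The following is a minimal sketch, assuming a small prime modulus `P` as a stand-in for the admissible field size $q$ (the paper's field need not be a prime field); the names `P`, `M`, `D`, `honest_round`, and the helper functions are illustrative and do not come from the paper.

```python
import itertools
import random

P = 5  # assumed small prime; arithmetic mod P stands in for F_q
M = 3  # number of variables m
D = 2  # individual degree bound d


def random_low_individual_degree_poly(rng):
    """Coefficient table {exponent tuple: coefficient}, every exponent <= D."""
    return {e: rng.randrange(P) for e in itertools.product(range(D + 1), repeat=M)}


def evaluate(g, u):
    """Evaluate the multivariate polynomial g at the point u."""
    total = 0
    for exps, c in g.items():
        term = c
        for x, e in zip(u, exps):
            term = term * pow(x, e, P) % P
        total = (total + term) % P
    return total


def restrict_to_axis_line(g, u, i):
    """Coefficients (low to high) of f(t) = g(u with coordinate i replaced by t).

    The index i is 0-based here, unlike the 1-based index in the text.
    """
    coeffs = [0] * (D + 1)
    for exps, c in g.items():
        term = c
        for j, (x, e) in enumerate(zip(u, exps)):
            if j != i:
                term = term * pow(x, e, P) % P
        coeffs[exps[i]] = (coeffs[exps[i]] + term) % P
    return coeffs


def eval_univariate(coeffs, t):
    return sum(c * pow(t, k, P) for k, c in enumerate(coeffs)) % P


def honest_round(g, rng):
    """One round of the axis-parallel lines-versus-points check with honest players."""
    u = tuple(rng.randrange(P) for _ in range(M))
    i = rng.randrange(M)
    a = evaluate(g, u)                  # points player's answer
    f = restrict_to_axis_line(g, u, i)  # lines player's answer
    return len(f) <= D + 1 and eval_univariate(f, u[i]) == a
```

Every call to `honest_round` should return `True`, reflecting that the restriction of $g$ to an axis-parallel line has degree at most $d$ and agrees with $g$ at $u$.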

When $k > 1$, the players are supposed to respond with a tuple of answers; and the test is intended to check that the players’ responses are consistent with a $k$-tuple of functions $g_j : \mathbb{F}_q^m \to \mathbb{F}_q$ that are polynomials where every variable has degree at most $d$.

The game that we use is slightly more elaborate than the “axis-parallel lines-versus-points” test just described. In addition, we have two other subtests. One is a “diagonal line-vs-points” test: here, the points player receives a point $u \in \mathbb{F}_q^m$ as before, but the lines player receives a line $\ell$ that goes through $u$ but is not necessarily axis-aligned. Instead, it is chosen by first picking an index $i \in \{1,2,\ldots,m\}$ uniformly at random, picking $v \in \mathbb{F}_q^m$ uniformly at random, and letting $v' = \pi_{i-1}(v) = (0,0,\ldots,0,v_i,v_{i+1},\ldots,v_m)$ (i.e., $\pi_{i-1}$ zeroes out the first $i-1$ coordinates of its input). Then, the line is defined as the set

$$\ell = \{u + t v' : t \in \mathbb{F}_q\},$$

which is a uniformly random line in the affine subspace of $\mathbb{F}_q^m$ corresponding to all points whose first $(i-1)$ coordinates match those of $u$. The points player responds with a value $a \in \mathbb{F}_q$, the lines player responds with a univariate polynomial $f_\ell$ of degree $md$,^{27} and the verifier checks that $f_\ell(t_u) = a$ (where $t_u \in \mathbb{F}_q$ is the projection of $u$ onto the line $\ell$ with respect to a canonical parameterization of $\ell$).

The other additional test is a “consistency test”, where both players receive either the same points question, the same axis-parallel lines question, or the same diagonal lines question. In all cases they are expected to respond with the same answer.


Axis-parallel lines and diagonal lines. Before presenting a formal description of the game $\mathfrak{G}^{\mathrm{LD}}$, we first define what we mean by lines, and their properties.


Definition 7.1 (Lines). A line $\ell$ in $\mathbb{F}_q^m$ specified by a pair $(u,v) \in (\mathbb{F}_q^m)^2$ is the subset

$$\ell(u,v) = \{u + tv : t \in \mathbb{F}_q\} \subseteq \mathbb{F}_q^m.$$ (47)

We call the vector $v$ a direction of line $\ell(u,v)$.^{28}

A line $\ell(u,v)$ is axis-parallel if there is a coordinate $i \in \{1,2,\ldots,m\}$ for which $v$ is equal to $e_i$, the $i$-th elementary basis vector in $\mathbb{F}_q^m$. We say that such a line $\ell$ is parallel to the $i$-th direction. A line $\ell(u,v)$ is diagonal if there exists an $i \in \{0,1,\ldots,m-1\}$ for which $v_1 = v_2 = \cdots = v_i = 0$.


The following Proposition establishes that, for a fixed direction, a line £ is an equivalence class of points 
on the line. 


Proposition 7.2. Let $u,v \in \mathbb{F}_q^m$. Then for all $u' \in \ell(u,v)$, we have that $\ell(u,v) = \ell(u',v)$.


Proof. Fix $u' \in \ell(u,v)$. We have that $u' = u + tv$ for some $t \in \mathbb{F}_q$. Let $a \in \ell(u,v)$. This means that $a = u + sv$ for some $s \in \mathbb{F}_q$. We also have that $a = u + (s - t + t)v = u + tv + (s - t)v = u' + (s - t)v$, so $\ell(u,v) \subseteq \ell(u',v)$. A similar argument shows that $\ell(u',v) \subseteq \ell(u,v)$. □
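
For intuition, Proposition 7.2 can be checked exhaustively on a toy instance. The sketch below assumes a small prime `P` in place of $q$ and a small dimension `M`; both values and the function names are illustrative choices, not part of the paper.

```python
import itertools

P = 3  # assumed small prime standing in for q
M = 2  # assumed toy dimension m


def line(u, v):
    """The set {u + t*v : t in F_P}, returned as a frozenset of points."""
    return frozenset(
        tuple((a + t * b) % P for a, b in zip(u, v)) for t in range(P)
    )


def proposition_7_2_holds():
    """Check that line(u, v) == line(w, v) for every point w on line(u, v)."""
    points = list(itertools.product(range(P), repeat=M))
    for u in points:
        for v in points:  # includes v = 0, where the line is a singleton
            base = line(u, v)
            if any(line(w, v) != base for w in base):
                return False
    return True
```

The check also covers the degenerate direction $v = 0$, for which the line is the singleton $\{u\}$.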


We now specify a canonical representation of lines $\ell = \ell(u,v)$ as pairs $(u_0, v)$ where $u_0$ is a canonical representative of $\ell$ specified as follows. Let $L^{\mathrm{LN}}_v : \mathbb{F}_q^m \to \mathbb{F}_q^m$ denote the canonical linear map with the one-dimensional kernel basis $\{v\}$ (see Definition 3.10 for a definition of canonical linear map). Note that for all $u \in \ell$, we have that $L^{\mathrm{LN}}_v(u)$ is a point in $\ell$. This is because the image and kernel of $L^{\mathrm{LN}}_v$ are complementary subspaces by definition (see Definition 3.10), so any $u \in \mathbb{F}_q^m$ has a unique decomposition $u = u_0 + u_1$ where $u_0$ is in the image of $L^{\mathrm{LN}}_v$ and $u_1 \in \ker(L^{\mathrm{LN}}_v)$, which by definition is the subspace spanned by $v$. Thus $u_1 = tv$ for some $t \in \mathbb{F}_q$ and we have that $u_0 = u - tv \in \ell$. In particular, if $u \in \ell$ then $u_0 = L^{\mathrm{LN}}_v(u) \in \ell$ as well. Furthermore, since $L^{\mathrm{LN}}_v$ is a projection onto its image by definition, we have $L^{\mathrm{LN}}_v(u_0) = u_0$.

Next, note that for any $u, u' \in \ell$, since $u - u' = sv$ for some $s \in \mathbb{F}_q$, it follows that $L^{\mathrm{LN}}_v(u) = L^{\mathrm{LN}}_v(u')$. Thus $u_0 = u_0'$ and this shows that $L^{\mathrm{LN}}_v$ gives a means to compute a canonical representative of lines.
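
The properties used above (the representative lies on the line, is the same for every point of the line, and is fixed by the map) can be illustrated with one concrete projection. The sketch below is an assumption-laden stand-in: Definition 3.10 is not reproduced in this section, so the rule used here (zero out the coordinate at the first nonzero entry of $v$) is a hypothetical choice of linear map with kernel spanned by $v$, not necessarily the paper's canonical map. A small prime `P` replaces $q$.

```python
P = 5  # assumed small prime standing in for q


def canonical_representative(u, v):
    """Hypothetical projection along v: returns u - t*v with a zero at v's pivot.

    The pivot is the first index where v is nonzero. This is one linear map
    whose kernel is spanned by v; the paper's Definition 3.10 may differ.
    """
    pivot = next((j for j, x in enumerate(v) if x % P != 0), None)
    if pivot is None:  # v = 0: the line is the singleton {u}
        return tuple(x % P for x in u)
    # pow(x, -1, P) computes a modular inverse (Python 3.8+).
    t = u[pivot] * pow(v[pivot], -1, P) % P
    return tuple((a - t * b) % P for a, b in zip(u, v))


def point_on_line(u, v, s):
    """The point u + s*v."""
    return tuple((a + s * b) % P for a, b in zip(u, v))
```

Under this rule, every point `point_on_line(u, v, s)` maps to the same representative, and applying the map twice changes nothing.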


^{27} Since $\ell$ is no longer axis-parallel, the restriction of an individual degree-$d$ polynomial $g$ to a line $\ell$ may yield a degree $md$ polynomial.
^{28} For convenience we explicitly allow $v = 0$, in which case the line is reduced to a singleton.




Definition 7.3 (Canonical representative of a line). Let $\ell$ be a line in direction $v$. A canonical representative of $\ell$ is the point $u_0 \in \mathbb{F}_q^m$ such that

$$u_0 = L^{\mathrm{LN}}_v(u)$$

for all $u \in \ell$.


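Question distribution. We now present the question distribution of the game $\mathfrak{G}^{\mathrm{LD}}$ as a typed CL distribution (see Section 6 for the definition of typed CL functions and typed CL distributions). The type set $\mathcal{T}^{\mathrm{LD}}$ of the game $\mathfrak{G}^{\mathrm{LD}}$ we use is

$$\mathcal{T}^{\mathrm{LD}} = \{\mathrm{POINT}, \mathrm{ALINE}, \mathrm{DLINE}\}.$$

The distribution over question types is uniform over $\mathcal{T}^{\mathrm{LD}} \times \mathcal{T}^{\mathrm{LD}}$.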

We introduce the corresponding typed CL functions $L^{\mathrm{POINT}}, L^{\mathrm{ALINE}}, L^{\mathrm{DLINE}}$ where the corresponding CL distributions $\mu_{L^{\mathrm{ALINE}}, L^{\mathrm{POINT}}}$ and $\mu_{L^{\mathrm{DLINE}}, L^{\mathrm{POINT}}}$ implement the axis line-versus-point and diagonal line-versus-point distributions, respectively. The functions are parametrized by a field size $q$, a dimension $m$, and three complementary register subspaces $V_X, V_I, V_Y$ of some ambient space. The register $V_X$ (which is isomorphic to $\mathbb{F}_q^m$) is called the point register, $V_I$ (which is isomorphic to $\mathbb{F}_q$) is called the coordinate register, and $V_Y$ (which is isomorphic to $\mathbb{F}_q^m$) is called the direction register. We let $V$ denote the direct sum $V_X \oplus V_I \oplus V_Y$. We identify elements of $V$ as triples $(u,s,v) \in \mathbb{F}_q^m \times \mathbb{F}_q \times \mathbb{F}_q^m$.

Define $L^{\mathrm{POINT}}$ to be the 1-level CL function that projects onto $V_X$: for every $(u,s,v) \in V$,

$$L^{\mathrm{POINT}}(u,s,v) = (u,0,0).$$ (48)

Define $L^{\mathrm{ALINE}}$ to be the following 2-level CL function:

$$L^{\mathrm{ALINE}}(u,s,v) = (L^{\mathrm{LN}}_{e_i}(u), s, 0) \quad \text{for all } (u,s,v) \in V,$$ (49)

where $L^{\mathrm{LN}}_{e_i}$ is the linear function used to compute the canonical representative of lines (see Definition 7.3), and $i = \chi(s)$, where we define $\chi(s)$ to be the unique integer such that

$$s = (\chi(s) - 1) \cdot \frac{q}{m} + r$$ (50)

for some $0 \le r < q/m$ (here we interpret the field element $s \in \mathbb{F}_q$ as an integer between $0$ and $q-1$). The function $L^{\mathrm{ALINE}}$ is the concatenation of the 1-level CL function that projects the subspaces $V_I \oplus V_Y$ down to $V_I$ (i.e. zeroes out the entire $V_Y$ register), and the family of 1-level CL functions $\{L^{\mathrm{LN}}_{e_i}\}$ acting on $V_X$, indexed by $i \in \{1,2,\ldots,m\}$. (See Lemma 4.7 for the definition of CL function concatenation.) Thus $L^{\mathrm{ALINE}}$ is a 2-level CL function.
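
The index map $\chi$ of Equation (50) amounts to an integer division. The sketch below assumes that $s$ has already been converted to an integer in $\{0,\ldots,q-1\}$; the function names are illustrative.

```python
from collections import Counter


def chi(s, q, m):
    """Return the unique i in {1,...,m} with s = (i - 1) * (q // m) + r, 0 <= r < q // m."""
    if q % m != 0:
        raise ValueError("m must divide q")
    if not 0 <= s < q:
        raise ValueError("s must be an integer in [0, q)")
    return s // (q // m) + 1


def index_histogram(q, m):
    """Count how many s in {0,...,q-1} map to each index i."""
    return Counter(chi(s, q, m) for s in range(q))
```

Because $m$ divides $q$, each index receives exactly $q/m$ values of $s$, which is why a uniformly random $s$ yields a uniformly random $i$.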

Define $L^{\mathrm{DLINE}}$ to be the following 3-level CL function: for all $(u,s,v) \in V$,

$$L^{\mathrm{DLINE}}(u,s,v) = (L^{\mathrm{LN}}_{\pi_{i-1}(v)}(u), s, \pi_{i-1}(v))$$ (51)

where $i = \chi(s)$. The function $L^{\mathrm{DLINE}}$ is the concatenation of the following CL functions (again, see Lemma 4.7 for the definition of CL function concatenation): (i) the identity function on $V_I$ (which is a 1-level CL function); (ii) for every $i \in \{1,2,\ldots,m\}$ the 1-level CL function on $V_Y$ that maps $v \mapsto \pi_{i-1}(v)$ (i.e. it zeroes out the first $i-1$ coordinates of $v$), and (iii) for every pair $(s, v') \in V_I \oplus V_Y$, the 1-level CL function $L^{\mathrm{LN}}_{v'}$ acting on $V_X$. Thus $L^{\mathrm{DLINE}}$ is a 3-level CL function.

The following two lemmas establish that the CL functions $L^{\mathrm{POINT}}, L^{\mathrm{ALINE}}, L^{\mathrm{DLINE}}$ give rise to the axis-parallel line-versus-point and diagonal line-versus-point distributions.




Lemma 7.4. Consider the pair $(\ell, u)$ sampled in the following way: first, sample the tuple $((u_0,s,0), (u,0,0))$ from the CL distribution $\mu_{L^{\mathrm{ALINE}}, L^{\mathrm{POINT}}}$ and let $\ell = \ell(u_0, e_i)$ where $i = \chi(s)$. Then $u$ is a uniformly random point in $\mathbb{F}_q^m$ and $\ell$ is a line that goes through $u$ and is parallel to a uniformly random axis $i \in \{1,2,\ldots,m\}$.


Proof. Since $s$ is uniformly random in $\mathbb{F}_q$ and $m$ divides $q$ by assumption, we have that $i$ is uniformly random in $\{1,2,\ldots,m\}$ and therefore $(u_0, s)$ specifies a uniformly random axis-parallel line $\ell(u_0, e_i)$. Since $u_0 = L^{\mathrm{LN}}_{e_i}(u)$, we have that $u \in \ell(u_0, e_i)$. □


Lemma 7.5. Consider the pair $(\ell, u)$ sampled in the following way: first, sample the tuple $((u_0,s,v), (u,0,0))$ from the CL distribution $\mu_{L^{\mathrm{DLINE}}, L^{\mathrm{POINT}}}$ and let $\ell = \ell(u_0, v)$ where $i = \chi(s)$. Then $u$ is a uniformly random point in $\mathbb{F}_q^m$ and $\ell$ is a random diagonal line that goes through $u$ and shares the first $i-1$ coordinates with $u$, for uniformly random $i$.


Proof. The proof of this follows identically to Lemma 7.4. □

For later convenience we make the following definition.


Definition 7.6 (Line-point distributions). The axis line-point distribution $D_{\mathrm{ALINE}}$ is the distribution $\mu_{L^{\mathrm{ALINE}}, L^{\mathrm{POINT}}}$ over pairs $(\ell = (u_0, e_i), u)$ (we omit the 0 elements for simplicity). The diagonal line-point distribution $D_{\mathrm{DLINE}}$ is the distribution $\mu_{L^{\mathrm{DLINE}}, L^{\mathrm{POINT}}}$ over $(\ell = (u_0,s,v), u)$. The line-point distribution $D_{\mathrm{LINE}}$ is the equal mixture of the axis-parallel line-point distribution and the diagonal line-point distribution. In other words, a sample from $D_{\mathrm{LINE}}$ is distributed as follows: with probability $\frac{1}{2}$, output a sample from $D_{\mathrm{ALINE}}$, and with probability $\frac{1}{2}$, output a sample from $D_{\mathrm{DLINE}}$.
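
A sampler for the mixture in Definition 7.6 can be sketched as follows. This is an illustration under several assumptions: a small prime `P` replaces $q$; the index $i$ is drawn directly rather than derived from $s$ via $\chi$; the line is returned as a pair (representative, direction); and `canonical_representative` uses a hypothetical pivot rule in place of the canonical linear map of Definition 3.10.

```python
import random

P, M = 5, 3  # assumed toy parameters: prime P stands in for q, M for m


def canonical_representative(u, v):
    """Hypothetical projection along v (zero at v's first nonzero index)."""
    pivot = next((j for j, x in enumerate(v) if x % P != 0), None)
    if pivot is None:
        return tuple(x % P for x in u)
    t = u[pivot] * pow(v[pivot], -1, P) % P
    return tuple((a - t * b) % P for a, b in zip(u, v))


def sample_line_point(rng):
    """Sample (kind, (u0, direction), u) from the equal ALINE/DLINE mixture."""
    u = tuple(rng.randrange(P) for _ in range(M))
    i = rng.randrange(1, M + 1)  # uniformly random index in {1,...,m}
    if rng.random() < 0.5:
        kind = "ALINE"
        direction = tuple(1 if j == i - 1 else 0 for j in range(M))  # e_i
    else:
        kind = "DLINE"
        v = tuple(rng.randrange(P) for _ in range(M))
        # pi_{i-1}: zero out the first i-1 coordinates of v.
        direction = tuple(0 if j < i - 1 else x for j, x in enumerate(v))
    return kind, (canonical_representative(u, direction), direction), u


def on_line(u, u0, direction):
    """True if u = u0 + t*direction for some t in F_P."""
    return any(
        all((a + t * b) % P == c for a, b, c in zip(u0, direction, u))
        for t in range(P)
    )
```

In every sample the point `u` lies on the returned line, matching the conclusions of Lemmas 7.4 and 7.5.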


Decision procedure. The decision procedure $D^{\mathrm{LD}}$ for the game $\mathfrak{G}^{\mathrm{LD}}$ is presented in Figure 3. The table at the top specifies a parsing scheme for the questions and answers, depending on the type of question. For example, when a player receives a question with type POINT, the question content $x$, a bit string of length $m \log q$, should be interpreted by the decision procedure and the players as an element of the vector space $\mathbb{F}_q^m$, as indicated in Section 3.3.2. Similarly the answer to a question with type POINT is expected to be a bit string of length $k \log q$, and is interpreted as an element of $\mathbb{F}_q^k$. For questions with type ALINE, the question content is a $(m \log q + \log q)$-bit string, which is parsed as a pair $(u_0,s) \in \mathbb{F}_q^m \times \mathbb{F}_q$, which in turn can be interpreted as a specification for an axis-parallel line $\ell(u_0,e_i)$ in $\mathbb{F}_q^m$ where $i = \chi(s)$, as described in Lemma 7.4. The answer is interpreted as the description of $k$ degree-$d$ univariate polynomials defined on the line $\ell(u_0, e_i)$, each polynomial given by a list of $d + 1$ coefficients. For questions with type DLINE, the question content is a $(2m \log q + \log q)$-bit string, which is parsed as a triple $(u_0,s,v) \in \mathbb{F}_q^m \times \mathbb{F}_q \times \mathbb{F}_q^m$ that specifies a diagonal line $\ell(u_0, v')$ where $v' = \pi_{i-1}(v)$ as described in Lemma 7.5. The answer is interpreted as the description of $k$ degree-$md$ univariate polynomials defined on the line $\ell(u_0, v')$. If the answers returned by the players do not fit this format the decision procedure rejects.

We define a special class of measurements that are relevant to the soundness properties of the low-degree 
test. 


Definition 7.7 (Low-degree polynomial measurements). Define $\mathrm{PolyMeas}(m,d,q)$ to be the set of POVM measurements whose outcomes correspond to (individual) degree-$d$ polynomials of $m$ variables over $\mathbb{F}_q$. More generally, for an integer $k$ and tuples $m = (m_1, m_2, \ldots, m_k)$, $d = (d_1,d_2,\ldots,d_k)$ and $q = (q_1,q_2,\ldots,q_k)$, we let $\mathrm{PolyMeas}(m, d, q, k)$ be the set of measurements $G = \{G_{g_1,g_2,\ldots,g_k}\}$ such that for $i \in \{1,2,\ldots,k\}$, $g_i$ is a polynomial $g_i : \mathbb{F}_{q_i}^{m_i} \to \mathbb{F}_{q_i}$ with individual degree $d_i$ (see Section 3.4 for a definition of individual degree).




Type     Question Content                                                                  Answer Format

POINT    $u \in \mathbb{F}_q^m$                                                            Element of $\mathbb{F}_q^k$
ALINE    $(u_0,s) \in \mathbb{F}_q^m \times \mathbb{F}_q$                                  $k$ degree-$d$ polynomials $f_j : \mathbb{F}_q \to \mathbb{F}_q$
DLINE    $(u_0,s,v) \in \mathbb{F}_q^m \times \mathbb{F}_q \times \mathbb{F}_q^m$          $k$ degree-$md$ polynomials $f_j : \mathbb{F}_q \to \mathbb{F}_q$


Input to $D^{\mathrm{LD}}$: $(t_A, x_A, t_B, x_B, a_A, a_B)$. In all cases where no action is indicated, accept. For $w \in \{A,B\}$,


1. (Consistency test) If $t_A = t_B$, accept iff $a_A = a_B$.


2. (Axis-parallel line-versus-point test) If $t_w = \mathrm{ALINE}$ and $t_{\bar{w}} = \mathrm{POINT}$, accept iff $f_j(t) = (a_{\bar{w}})_j$ for all $j \in \{1,2,\ldots,k\}$ where $t \in \mathbb{F}_q$ is such that $x_{\bar{w}} = u_0 + t e_i$ where $i = \chi(s)$.

3. (Diagonal line-versus-point test) If $t_w = \mathrm{DLINE}$ and $t_{\bar{w}} = \mathrm{POINT}$, accept iff $f_j(t) = (a_{\bar{w}})_j$ for all $j \in \{1,2,\ldots,k\}$ where $t \in \mathbb{F}_q$ is such that $x_{\bar{w}} = u_0 + t v'$, where $v' = \pi_{i-1}(v)$ with $i = \chi(s)$.


Figure 3: The decision procedure $D^{\mathrm{LD}}$ for the simultaneous low-degree test, parameterized by $\mathsf{ldparams} = (q,m,d,k)$. The function $\chi(s)$ is defined in Equation (50), and $\pi_{i-1}(v)$ zeroes out the first $i-1$ coordinates of $v$.
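
The three checks of Figure 3 can be sketched in code. The version below is a simplified illustration, not the paper's decider: it assumes a small prime `P` in place of $q$, takes line questions already parsed into a pair (representative, direction) rather than the bit-string encodings $(u_0,s)$ or $(u_0,s,v)$, represents each polynomial as a coefficient list, and rejects if the point is not on the line (a case that does not arise under the question distribution).

```python
P = 5  # assumed small prime standing in for q


def eval_poly(coeffs, t):
    """Evaluate a univariate polynomial given low-to-high coefficients."""
    return sum(c * pow(t, e, P) for e, c in enumerate(coeffs)) % P


def line_parameter(x, u0, direction):
    """Return t with x = u0 + t*direction, or None if x is off the line."""
    for t in range(P):
        if all((a + t * b) % P == c for a, b, c in zip(u0, direction, x)):
            return t
    return None


def decide(type_a, q_a, ans_a, type_b, q_b, ans_b, d, m):
    """Accept (True) or reject (False) following the structure of Figure 3."""
    if type_a == type_b:  # consistency test
        return ans_a == ans_b
    players = ((type_a, q_a, ans_a), (type_b, q_b, ans_b))
    for (tl, ql, al), (tp, qp, ap) in (players, players[::-1]):
        if tl in ("ALINE", "DLINE") and tp == "POINT":
            u0, direction = ql
            bound = d if tl == "ALINE" else m * d
            # Format check: k polynomials, each with at most bound+1 coefficients.
            if len(al) != len(ap) or any(len(f) > bound + 1 for f in al):
                return False
            t = line_parameter(qp, u0, direction)
            if t is None:
                return False
            return all(eval_poly(f, t) == a for f, a in zip(al, ap))
    return True  # no action indicated for this type pair: accept
```

For instance, with $k = 1$ the line answer $1 + 2t$ on the line through $(0,3)$ in direction $e_1$ is consistent with the point answer $0$ at $(2,3)$, since $1 + 2 \cdot 2 \equiv 0 \pmod 5$.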


Quantum soundness of a version of the classical low-degree test for polynomials of low total degree was claimed in [NV18a], building on an analysis of a three-prover version of the test in [Vid16]. Unfortunately there is a gap in the soundness analysis of this test. For our construction we use the low individual degree test, whose quantum soundness is shown in [JNV+20] and generalized in [JNV+21]. We state the result in the form that is most directly useful for us, and show how the stated result follows from [JNV+21].


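Theorem 7.8 (Quantum soundness of the simultaneous classical low-degree test). There exists a function

$$\delta_{\mathrm{LD}}(\varepsilon, q, m, d, k) = a(dmk)^a \left(\varepsilon^b + q^{-b} + 2^{-bmd}\right)$$

for universal constants $a > 1$ and $0 < b < 1$ such that the following holds. For all $\varepsilon > 0$ and parameter tuple $\mathsf{ldparams} = (q, m, d, k)$, for all projective strategies $(\psi, A, B)$ that succeed with probability at least $1 - \varepsilon$ in the game $\mathfrak{G}^{\mathrm{LD}}_{\mathsf{ldparams}}$, there exist measurements

$$G^w \in \mathrm{PolyMeas}(m, d, q, k)$$

on $\mathcal{H}_w$, for $w \in \{A,B\}$, such that

$$A^{\mathrm{POINT},u}_{a_1,a_2,\ldots,a_k} \otimes I_B \approx_{\delta_{\mathrm{LD}}} I_A \otimes G^B_{[(g_1(u),g_2(u),\ldots,g_k(u)) = (a_1,a_2,\ldots,a_k)]},$$

$$G^A_{[(g_1(u),g_2(u),\ldots,g_k(u)) = (a_1,a_2,\ldots,a_k)]} \otimes I_B \approx_{\delta_{\mathrm{LD}}} I_A \otimes B^{\mathrm{POINT},u}_{a_1,a_2,\ldots,a_k},$$

$$G^A_{g_1,g_2,\ldots,g_k} \otimes I_B \approx_{\delta_{\mathrm{LD}}} I_A \otimes G^B_{g_1,g_2,\ldots,g_k},$$

where $\delta_{\mathrm{LD}} = \delta_{\mathrm{LD}}(\varepsilon, q, m, d, k)$, and all three statements are with respect to the state $|\psi\rangle$. In addition, the first two approximations hold under the uniform distribution over $u \in \mathbb{F}_q^m$, whereas there is no question distribution associated with the third approximation.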




Proof. Let $\mathsf{ldparams} = (q,m,d,k)$ where $m,d,k \in \mathbb{N}$ are integers, $q \in \mathbb{N}$ is an admissible field size, and $m$ divides $q$. Let $\Sigma = \mathbb{F}_q$. Let $C \subseteq \Sigma^n$ be the Reed-Solomon code with degree $d$. Explicitly, $C$ is the set of all $(p(x_1),\ldots,p(x_n))$ where $p : \mathbb{F}_q \to \mathbb{F}_q$ is a univariate polynomial of degree at most $d$ and $x_1,\ldots,x_n$ are an enumeration of $\mathbb{F}_q$ (so $n = q$). Then in the (standard) notation from [JNV+21], $C$ is a linear $[q, d + 1, n - d + 1]_\Sigma$ code.

Observe that the two-prover tensor code test, for the code $C^{\otimes m}$ as described in [JNV+21, Figure 1 and Figure 2], is identical to the game $\mathfrak{G}^{\mathrm{LD}}$ with parameters $\mathsf{ldparams}$ and $k = 1$, whose decider is presented in Figure 3. This is because

1. The axis-parallel line-versus-point test from Figure 3 is identical to the axis-parallel lines test from [JNV+21, Figure 1], with the only difference that in Figure 3 the roles of A and B are chosen uniformly at random. This is consistent with the two-prover variant of the tensor code test from [JNV+21, Section 4.1].

2. The diagonal line-versus-point test from Figure 3 is identical to the subcube commutation test from [JNV+21, Figure 1], with again the added symmetrization between players.

3. The consistency test from Figure 3 is identical to the synchronicity test from [JNV+21, Figure 2].

As a result, [JNV+21, Theorem 4.7], in which we choose the parameter $k$ as $k = md$, immediately yields Theorem 7.8 for the game with a single function (that is, with the game parameter $k$ of $\mathsf{ldparams}$ set to 1).

To conclude it remains to extend the result to the case of general $k$. This is done via a standard reduction, following exactly the same steps as the derivation of Theorem 4.43 from Theorem 4.40 in [NW19]. □


7.1.2 Complexity of the classical low-degree test. 


The CL functions and decision procedure of the low-degree test are incorporated as subroutines in some of the normal form verifiers constructed in subsequent sections. The next lemma establishes the time complexity of these procedures as a function of the parameter tuple $\mathsf{ldparams} = (q, m, d, k)$. The lemma also establishes the time complexity of computing the description of the decision procedure $D^{\mathrm{LD}}$ as a Turing machine, given the parameter tuple $\mathsf{ldparams}$ as input.


Lemma 7.9 (Complexity of the classical low-degree test). Let $\mathsf{ldparams} = (q, m, d, k)$ denote a parameter tuple.


1. The time complexity of the decision procedure $D^{\mathrm{LD}}$ parametrized by $\mathsf{ldparams}$ is $\mathrm{poly}(m,d,k, \log q)$.


2. The time complexity of evaluating marginals of the CL functions $L^{\mathrm{POINT}}$, $L^{\mathrm{ALINE}}$, and $L^{\mathrm{DLINE}}$ at a given input point is $\mathrm{poly}(m, \log q)$.


3. The Turing machine description of the decision procedure $D^{\mathrm{LD}}$ parametrized by $\mathsf{ldparams}$ can be computed from $\mathsf{ldparams}$ in $\mathrm{polylog}(q,m,d,k)$ time.


Proof. Finite field arithmetic over $\mathbb{F}_q$ can be performed in time $\mathrm{polylog}\, q$, by Lemma 3.18. The most expensive step in $D^{\mathrm{LD}}$ is to evaluate a univariate polynomial $f : \mathbb{F}_q \to \mathbb{F}_q^k$ at some point $t_u \in \mathbb{F}_q$ that corresponds to a point $u \in \mathbb{F}_q^m$, which takes time $\mathrm{poly}(m, d, k, \log q)$. The function $L^{\mathrm{POINT}}$ is a projection onto $V_X$, which takes time $\mathrm{poly}(m, \log q)$ to compute (it just involves “zeroing out” the registers outside of $V_X$). The functions $L^{\mathrm{ALINE}}$ and $L^{\mathrm{DLINE}}$ require computing a canonical linear map, which requires performing Gaussian elimination and can be done in time $\mathrm{poly}(m, \log q)$.




The Turing-machine description of the decision procedure $D^{\mathrm{LD}}$ can be uniformly computed from the integers $(q, m, d, k)$ expressed in binary; the complexity of computing the description comes from describing the parameter tuple $\mathsf{ldparams}$, which takes time that is at most polynomial in the bit length of $(q, m, d, k)$. □


7.2 The Magic Square game 


We recall the Magic Square game of Mermin and Peres [Mer90, Per90, Ara02]. The Magic Square game 
is a simple self-test for EPR pairs (it tests for two of them). In addition, it allows one to test that a pair of 
observables anticommutes. Here we use it as a building block to construct the quantum low-degree test. 

There are several formulations of the Magic Square game; here we present it as a binary constraint satisfaction game [CM14]. In this formulation of the game (denoted by $\mathfrak{G}^{\mathrm{MS}}$) there are 6 linear equations defined over 9 variables that take values in $\mathbb{F}_2$. The variables correspond to the cells of a $3 \times 3$ grid, as depicted in Figure 4. Five of the equations correspond to the constraint that the sum of the variables in each row and the first two columns must be equal to 0, and the last equation requires that the sum of the variables in the last column must be equal to 1.
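
A short brute-force check illustrates why this system is interesting: no assignment of bits to the nine cells satisfies all six equations, since summing the three row equations and summing the three column equations gives the same total with different required parities. The sketch below assumes the cells are numbered 0 to 8 in row-major order; the names are illustrative.

```python
import itertools

ROWS = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
COLS = [(0, 3, 6), (1, 4, 7), (2, 5, 8)]
# (cells, required parity): rows and the first two columns sum to 0, last column to 1.
CONSTRAINTS = [(r, 0) for r in ROWS] + [(COLS[0], 0), (COLS[1], 0), (COLS[2], 1)]


def satisfied(assignment):
    """Number of the six constraints satisfied by a 9-bit assignment."""
    return sum(
        sum(assignment[c] for c in cells) % 2 == parity
        for cells, parity in CONSTRAINTS
    )


def max_satisfiable():
    """Maximum number of constraints any classical assignment satisfies."""
    return max(satisfied(x) for x in itertools.product((0, 1), repeat=9))
```

The all-zero assignment already satisfies five of the six constraints, and the exhaustive search confirms that five is the maximum.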


Figure 4: The Magic Square game 


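The question set $\mathcal{T}^{\mathrm{MS}}$ of the Magic Square game is the following:

$$\mathcal{T}^{\mathrm{MC}} = \{\mathrm{CONSTRAINT}_i : i = 1,2,\ldots,6\},$$
$$\mathcal{T}^{\mathrm{MV}} = \{\mathrm{VARIABLE}_j : j = 1,2,\ldots,9\},$$
$$\mathcal{T}^{\mathrm{MS}} = \mathcal{T}^{\mathrm{MC}} \cup \mathcal{T}^{\mathrm{MV}}.$$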


The questions CONSTRAINT$_i$ for $i \in \{1,2,3\}$ correspond to the three row constraints, the questions CONSTRAINT$_4$, CONSTRAINT$_5$ correspond to the first two column constraints, and question CONSTRAINT$_6$ corresponds to the third column constraint.

In the Magic Square game, the verifier first samples a constraint CONSTRAINT$_i \in \mathcal{T}^{\mathrm{MC}}$ uniformly at random, and then samples VARIABLE$_j$, one of the three variables in the row or column corresponding to CONSTRAINT$_i$, uniformly at random. One player is randomly assigned to be the CONSTRAINT player, and the other is assigned to be the VARIABLE player. The CONSTRAINT player is sent the question CONSTRAINT$_i$ and is expected to respond with three bits $(b_{v_1}, b_{v_2}, b_{v_3}) \in \mathbb{F}_2^3$, where $(v_1, v_2, v_3)$ are the indices of the three variables corresponding to CONSTRAINT$_i$. The VARIABLE player is given question VARIABLE$_j$ and is expected to respond with a single bit $y \in \mathbb{F}_2$. The players win if the CONSTRAINT player’s answers satisfy the equation associated with CONSTRAINT$_i$, and $y = b_j$. More precisely, the verifier samples an edge of the type graph (see Section 6) $G^{\mathrm{MS}}$ in Fig. 5, sends one endpoint to a random player, and the other endpoint to the other player.

The following theorem records the self-testing (also known as rigidity) properties of the Magic Square game. Its self-testing properties are crucial to the Pauli basis test. In particular, it is used to enforce anticommutation relations between certain pairs of operators. We will refer to this theorem in the proof of the soundness of the Pauli basis test (Theorem 7.14 below) in Appendix A.


Figure 5: Type graph $G^{\mathrm{MS}}$ for the Magic Square game, with vertices CONSTRAINT$_1$, ..., CONSTRAINT$_6$ and VARIABLE$_1$, ..., VARIABLE$_9$.


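Theorem 7.10 (Rigidity of the Magic Square game). Let $\mathscr{S} = (|\psi\rangle, A, B)$ be a strategy that succeeds in the Magic Square game $\mathfrak{G}^{\mathrm{MS}}$ with probability $1 - \varepsilon$, where $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ is a state. Then there exist local isometries $\phi_A : \mathcal{H}_A \to \mathcal{H}_{A'} \otimes \mathcal{H}_{A''}$, $\phi_B : \mathcal{H}_B \to \mathcal{H}_{B'} \otimes \mathcal{H}_{B''}$ (where $\mathcal{H}_{A'}, \mathcal{H}_{B'} \cong \mathbb{C}^4$ and $\mathcal{H}_{A''}, \mathcal{H}_{B''}$ are finite dimensional) and a state $|\mathrm{AUX}\rangle \in \mathcal{H}_{A''} \otimes \mathcal{H}_{B''}$ such that

1. $\big\| \phi_A \otimes \phi_B |\psi\rangle - |\mathrm{EPR}_2\rangle^{\otimes 2} \otimes |\mathrm{AUX}\rangle \big\| \le O(\sqrt{\varepsilon})$,

2. Letting $\tilde{A}^x_a = \phi_A A^x_a \phi_A^\dagger$ and $\tilde{B}^x_a = \phi_B B^x_a \phi_B^\dagger$, we have that

$$\tilde{A}^{\mathrm{VARIABLE}_1}_b \otimes I_{B'B''} \approx_{\sqrt{\varepsilon}} (\sigma^X_b)_{A'} \otimes I_{A''B'B''},$$

$$\tilde{A}^{\mathrm{VARIABLE}_5}_b \otimes I_{B'B''} \approx_{\sqrt{\varepsilon}} (\sigma^Z_b)_{A'} \otimes I_{A''B'B''},$$

$$I_{A'A''} \otimes \tilde{B}^{\mathrm{VARIABLE}_1}_b \approx_{\sqrt{\varepsilon}} I_{A'A''B''} \otimes (\sigma^X_b)_{B'},$$

$$I_{A'A''} \otimes \tilde{B}^{\mathrm{VARIABLE}_5}_b \approx_{\sqrt{\varepsilon}} I_{A'A''B''} \otimes (\sigma^Z_b)_{B'},$$

where the $\approx_{\sqrt{\varepsilon}}$ statement holds with respect to the state $|\mathrm{EPR}_2\rangle^{\otimes 2}_{A'B'} \otimes |\mathrm{AUX}\rangle_{A''B''}$ and the answer summation is over $b \in \{0,1\}$. As a consequence, letting

$$A^{\mathrm{VARIABLE}_1} = \tilde{A}^{\mathrm{VARIABLE}_1}_0 - \tilde{A}^{\mathrm{VARIABLE}_1}_1,$$

$$A^{\mathrm{VARIABLE}_5} = \tilde{A}^{\mathrm{VARIABLE}_5}_0 - \tilde{A}^{\mathrm{VARIABLE}_5}_1,$$

with $B^{\mathrm{VARIABLE}_1}$ and $B^{\mathrm{VARIABLE}_5}$ defined analogously from $\tilde{B}$, it holds that

$$A^{\mathrm{VARIABLE}_1} A^{\mathrm{VARIABLE}_5} \otimes I_{B'B''} \approx_{\sqrt{\varepsilon}} -A^{\mathrm{VARIABLE}_5} A^{\mathrm{VARIABLE}_1} \otimes I_{B'B''},$$

$$I_{A'A''} \otimes B^{\mathrm{VARIABLE}_1} B^{\mathrm{VARIABLE}_5} \approx_{\sqrt{\varepsilon}} -I_{A'A''} \otimes B^{\mathrm{VARIABLE}_5} B^{\mathrm{VARIABLE}_1},$$

where the notation $M \approx_\delta N$ for $M, N$ that are not POVM elements means

$$\langle\psi|(M - N)^\dagger (M - N)|\psi\rangle \le O(\delta).$$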


Proof. A proof of the rigidity of the Magic Square game can be found in [CS17, Theorem 6.9]. There are a couple of minor differences between their rigidity statement and the one stated here. First, they state that the robustness of the Magic Square self-test is $O(\varepsilon)$; however they specify the closeness between the actual and ideal states in terms of the trace norm, whereas we specify the closeness between $\phi_A \otimes \phi_B |\psi\rangle$ and $|\mathrm{EPR}_2\rangle^{\otimes 2} \otimes |\mathrm{AUX}\rangle$ in terms of the Euclidean distance, which translates to $O(\sqrt{\varepsilon})$ instead of $O(\varepsilon)$. Second, their choice of ideal strategy specifies that the observables corresponding to VARIABLE$_1$ and VARIABLE$_5$
questions are I & 0% and ogo% @ o“%o*; however under a local change of basis these are equivalent to 
o* & I and g% & I respectively. O 


We will need the following theorem, which shows that any pair of anticommuting observables can be 
used to form a value-1 strategy for the Magic Square game. 


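Theorem 7.11. Let $A = \{A_b\}_{b \in \mathbb{F}_2}$ and $B = \{B_b\}_{b \in \mathbb{F}_2}$ be two-outcome projective measurements acting on $(\mathbb{C}^q)^{\otimes n}$ which are consistent on $|\mathrm{EPR}_q\rangle^{\otimes n}$, and let $O_A = A_0 - A_1$ and $O_B = B_0 - B_1$ be the corresponding observables. Suppose that $O_A O_B = -O_B O_A$. Then there exists a symmetric strategy $\mathscr{S} = (\psi, M)$ for the Magic Square game with the following properties.


1. $\mathscr{S}$ is an SPCC strategy of value 1.
2. The state $|\psi\rangle$ has the form $|\psi\rangle = |\mathrm{EPR}_q\rangle^{\otimes n} \otimes |\mathrm{EPR}_2\rangle$.
3. For $b \in \{0,1\}$, we have $M^{\mathrm{VARIABLE}_1}_b = A_b \otimes I$ and $M^{\mathrm{VARIABLE}_5}_b = B_b \otimes I$.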


Proof. The strategy $\mathscr{S}$ is based on the canonical two-qubit strategy for the Magic Square game as described in, for example, [Ara02]. The state is $|\psi\rangle = |\mathrm{EPR}_q\rangle^{\otimes n} \otimes |\mathrm{EPR}_2\rangle$. We specify the measurements of $\mathscr{S}$ in Figure 6 as an operator solution for the Magic Square game, meant to be read as follows: each cell contains a two-outcome projective measurement $\{E_0, E_1\}$ on $(\mathbb{C}^q)^{\otimes n} \otimes \mathbb{C}^2$ written as its $\pm 1$-valued observable $E_0 - E_1$. When Player A or B receives the question VARIABLE$_j$ for $j \in \{1,\ldots,9\}$, they measure their share of $|\psi\rangle$ using the measurement specified by the cell corresponding to VARIABLE$_j$ and obtain a single-bit measurement outcome. When they receive the question CONSTRAINT$_i$ for $i \in \{1,2,\ldots,6\}$, they simultaneously perform the three measurements in the corresponding row or column on $|\psi\rangle$ to obtain three bits. For example, if Player A receives question VARIABLE$_1$, they measure $|\psi\rangle$ using the measurement $\{A_0 \otimes I, A_1 \otimes I\}$ corresponding to the observable $O_A \otimes I$ (where the first operator acts on $|\mathrm{EPR}_q\rangle^{\otimes n}$ and the second acts on $|\mathrm{EPR}_2\rangle$). Similarly, on question VARIABLE$_5$, they use the measurement $\{B_0 \otimes I, B_1 \otimes I\}$. This establishes Item 3 of the theorem.

First, we show that this gives a well-defined strategy. The VARIABLE$_j$ measurements are well-defined because each cell contains a $\pm 1$-valued observable. This is obvious for all $j \neq 9$; when $j = 9$, the bottom-right cell contains $O_A O_B \otimes \sigma^Z \sigma^X$. Because $O_A$ and $O_B$ anti-commute,

$$O_A O_B \otimes \sigma^Z \sigma^X = -O_A O_B \otimes \sigma^X \sigma^Z = O_B O_A \otimes \sigma^X \sigma^Z = (O_A O_B \otimes \sigma^Z \sigma^X)^\dagger.$$ (52)

As a result, this matrix is Hermitian. In addition,

$$(O_A O_B \otimes \sigma^Z \sigma^X)^2 = (O_A O_B \otimes \sigma^Z \sigma^X) \cdot (O_B O_A \otimes \sigma^X \sigma^Z) = (O_A O_B \cdot O_B O_A) \otimes (\sigma^Z \sigma^X \cdot \sigma^X \sigma^Z) = I,$$

where the first step uses Equation (52) and the final step uses the fact that $O_A, O_B, \sigma^X, \sigma^Z$ are $\pm 1$-valued observables and hence square to the identity. As a result, this matrix is Hermitian and squares to the identity. Therefore, it is a $\pm 1$-valued observable.


Figure 6: Observables for Magic Square strategy

As for the CONSTRAINT$_i$ measurements, we must show that the three measurements in each row and column are simultaneously measurable. This is equivalent to the three $\pm 1$-valued observables being simultaneously diagonalizable, which is equivalent to them being pairwise commuting. This can be easily verified for the cases of $i = 1,2,4,5$ (i.e. the first two rows and columns). In the case of $i = 3$, commutativity of $O_A \otimes \sigma^Z$ and $O_B \otimes \sigma^X$ follows from Equation (52). Since these two matrices commute, they also commute with their product $(O_A \otimes \sigma^Z)(O_B \otimes \sigma^X) = O_A O_B \otimes \sigma^Z \sigma^X$. The case of $i = 6$ is similar.

By construction, $\mathscr{S}$ is symmetric, and we have already shown that it is projective. It remains to show that it is commuting, consistent, and value 1. To show that it is commuting, it suffices to show that the measurement for each cell is simultaneously measurable with all three measurements in its row or column, which was already proved above. Now we show consistency of the measurements. We will show this by instead showing consistency of their observables. By this we mean the following: if $E = \{E_0, E_1\}$ is a two-outcome projective measurement on $\mathcal{H}$, then its corresponding $\pm 1$-valued observable $O = E_0 - E_1$ is consistent on a state $|\psi\rangle \in \mathcal{H} \otimes \mathcal{H}$ if

$$O \otimes I_B \cdot |\psi\rangle = I_A \otimes O \cdot |\psi\rangle.$$

This is in fact equivalent to the notion of $E$ being consistent on $|\psi\rangle$ from Definition 5.9. To see this, if $E$ is consistent on $|\psi\rangle$, then so is $O$, because

$$O \otimes I_B \cdot |\psi\rangle = (E_0 - E_1) \otimes I_B \cdot |\psi\rangle = I_A \otimes (E_0 - E_1) \cdot |\psi\rangle = I_A \otimes O \cdot |\psi\rangle,$$

where the second equality is by consistency of $E$. On the other hand, suppose $O$ is consistent on $|\psi\rangle$. Then because $E_0 + E_1 = I$, we can write $E_0 = \frac{1}{2} \cdot (I + O)$ and $E_1 = \frac{1}{2} \cdot (I - O)$. As a result,

$$E_0 \otimes I_B \cdot |\psi\rangle = \tfrac{1}{2} \cdot (I + O) \otimes I_B \cdot |\psi\rangle = I_A \otimes \tfrac{1}{2} \cdot (I + O) \cdot |\psi\rangle = I_A \otimes E_0 \cdot |\psi\rangle,$$

and similarly for $E_1$. Thus, $E$ is consistent on $|\psi\rangle$. This proves the equivalence.

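Now we use this to prove consistency of $\mathscr{S}$. To begin, we note that because $A$ and $B$ are consistent on $|\mathrm{EPR}_q\rangle^{\otimes n}$ by assumption, so too are their $\pm 1$-valued observables $O_A$ and $O_B$. Next, we note that $\sigma^X$ and $\sigma^Z$ are consistent on $|\mathrm{EPR}_2\rangle$; to verify this, we recall that $\sigma^X|0\rangle = |1\rangle$ and $\sigma^X|1\rangle = |0\rangle$, and so

$$\sigma^X \otimes I_B \cdot |\mathrm{EPR}_2\rangle = \sigma^X \otimes I_B \cdot \tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle) = \tfrac{1}{\sqrt{2}}(|10\rangle + |01\rangle) = I_A \otimes \sigma^X \cdot \tfrac{1}{\sqrt{2}}(|11\rangle + |00\rangle) = I_A \otimes \sigma^X \cdot |\mathrm{EPR}_2\rangle.$$

This shows $\sigma^X$ is consistent on $|\mathrm{EPR}_2\rangle$, and a similar argument shows the same for $\sigma^Z$. We now use these two facts to show that the observable in each cell of Figure 6 is consistent on $|\psi\rangle$. To see why, consider the $j = 9$ case:

$(O_A O_B \otimes \sigma^Z \sigma^X)_A \otimes I_B \cdot |\mathrm{EPR}_q\rangle^{\otimes n} \otimes |\mathrm{EPR}_2\rangle$
$= (O_A O_B \otimes \sigma^Z)_A \otimes (I \otimes \sigma^X)_B \cdot |\mathrm{EPR}_q\rangle^{\otimes n} \otimes |\mathrm{EPR}_2\rangle$ (consistency of $\sigma^X$)
$= (O_A O_B \otimes I)_A \otimes (I \otimes \sigma^X \sigma^Z)_B \cdot |\mathrm{EPR}_q\rangle^{\otimes n} \otimes |\mathrm{EPR}_2\rangle$ (consistency of $\sigma^Z$)
$= (O_A \otimes I)_A \otimes (O_B \otimes \sigma^X \sigma^Z)_B \cdot |\mathrm{EPR}_q\rangle^{\otimes n} \otimes |\mathrm{EPR}_2\rangle$ (consistency of $O_B$)
$= I_A \otimes (O_B O_A \otimes \sigma^X \sigma^Z)_B \cdot |\mathrm{EPR}_q\rangle^{\otimes n} \otimes |\mathrm{EPR}_2\rangle$ (consistency of $O_A$)
$= I_A \otimes (O_A O_B \otimes \sigma^Z \sigma^X)_B \cdot |\mathrm{EPR}_q\rangle^{\otimes n} \otimes |\mathrm{EPR}_2\rangle$. (by Equation (52))

The remaining cases of $j \in \{1,\ldots,8\}$ are similar, and we omit them. This shows that the observable in each cell is consistent on $|\psi\rangle$, and as a result the corresponding two-outcome projective measurements are consistent as well. As a result, the VARIABLE$_j$ measurements are consistent.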

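As for the CONSTRAINT$_i$ measurements, each such measurement $\{F_{a,b,c}\}_{a,b,c \in \{0,1\}}$ is of the form $F_{a,b,c} = E^1_a \cdot E^2_b \cdot E^3_c$, where $E^1$, $E^2$, and $E^3$ are VARIABLE measurements. But then consistency of $F$ follows from the VARIABLE consistencies:

$F_{a,b,c} \otimes I_B |\psi\rangle = (E^1_a \cdot E^2_b \cdot E^3_c)_A \otimes I_B |\psi\rangle$
$= (E^1_a \cdot E^2_b)_A \otimes (E^3_c)_B |\psi\rangle$ (consistency of $E^3$)
$= (E^1_a)_A \otimes (E^3_c \cdot E^2_b)_B |\psi\rangle$ (consistency of $E^2$)
$= I_A \otimes (E^3_c \cdot E^2_b \cdot E^1_a)_B |\psi\rangle$ (consistency of $E^1$)
$= I_A \otimes (E^1_a \cdot E^2_b \cdot E^3_c)_B |\psi\rangle$ (commutativity of $E^1$, $E^2$, and $E^3$)
$= I_A \otimes F_{a,b,c} |\psi\rangle$.

Hence, all measurements are consistent.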

Since all measurements are consistent, this implies that the answer bit of the player receiving a VARIABLE question is always consistent with the corresponding answer bit of the player receiving the CONSTRAINT question. Similarly, the answers of the player receiving the CONSTRAINT question always satisfy the given constraint; observe that in all rows and the first two columns, the observables multiply to $I$, whereas the observables in the last column multiply to $-I$. This implies that the strategy is value-1. To see why, let us again consider a CONSTRAINT$_i$ measurement, which we write as $F = \{F_{a,b,c}\}_{a,b,c \in \{0,1\}}$. As above, it can be written as $F_{a,b,c} = E^1_a \cdot E^2_b \cdot E^3_c$, where $E^1$, $E^2$, and $E^3$ are VARIABLE measurements. Consider the measurement $S = \{S_0, S_1\}$ that corresponds to measuring $\{F_{a,b,c}\}$ and then outputting the sum of measured values of $a$, $b$, and $c$. In other words,

$$S_0 = F_{0,0,0} + F_{0,1,1} + F_{1,0,1} + F_{1,1,0}, \qquad S_1 = F_{0,0,1} + F_{0,1,0} + F_{1,0,0} + F_{1,1,1}.$$

The following expression gives a convenient formula for the $\pm 1$-valued observable $S_0 - S_1$:

$$(E^1_0 - E^1_1) \cdot (E^2_0 - E^2_1) \cdot (E^3_0 - E^3_1) = E^1_0 E^2_0 E^3_0 - E^1_0 E^2_0 E^3_1 - E^1_0 E^2_1 E^3_0 + E^1_0 E^2_1 E^3_1 + \cdots = S_0 - S_1.$$

When we are looking at CONSTRAINT$_i$ for $i \in \{1,2,3,4,5\}$, the left-hand side of this equation is equal to $I$, and so $I = S_0 - S_1$. Because $S_0$ and $S_1$ are both positive semidefinite, this is true only if $S_0 = I$ and $S_1 = 0$, which implies that the three bits output by $F$ always sum to 0. As a result, this strategy always satisfies the CONSTRAINT$_i$ question. On the other hand, when $i = 6$, then the left-hand side of this equation is equal to $-I$, and so $S_1 = I$, and the three bits sum to 1. Thus, the strategy always satisfies the CONSTRAINT$_6$ question as well. This concludes the proof. □
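
The sign pattern used in the proof (rows and the first two columns multiply to $I$, the last column to $-I$) can be checked numerically on a toy instance. The sketch below takes $O_A = \sigma^X$ and $O_B = \sigma^Z$ on a single qubit as an example of an anticommuting pair. The $3 \times 3$ layout in `GRID` is an assumption made for illustration: the text above fixes only the cells for VARIABLE$_1$ ($O_A \otimes I$), VARIABLE$_5$ ($O_B \otimes I$), the third row, and VARIABLE$_9$, and the remaining cells are filled in so that the stated products hold.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def scale(a, c):
    return [[c * x for x in row] for row in a]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
O_A, O_B = X, Z  # toy anticommuting pair; any anticommuting observables would do

# Assumed layout of the operator solution (rows of the square).
GRID = [
    [kron(O_A, I2), kron(I2, X), kron(O_A, X)],
    [kron(I2, Z), kron(O_B, I2), kron(O_B, Z)],
    [kron(O_A, Z), kron(O_B, X), kron(matmul(O_A, O_B), matmul(Z, X))],
]
IDENTITY = kron(I2, I2)


def product(cells):
    out = IDENTITY
    for c in cells:
        out = matmul(out, c)
    return out


def line_products():
    """Products of the observables along each row and each column."""
    rows = [product(GRID[r]) for r in range(3)]
    cols = [product([GRID[r][c] for r in range(3)]) for c in range(3)]
    return rows, cols


def is_symmetric(a):
    return all(a[i][j] == a[j][i] for i in range(len(a)) for j in range(len(a)))
```

Since all matrices here are real, symmetry of the bottom-right cell plays the role of the Hermiticity established in Equation (52).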


7.3 The Pauli basis test 


We introduce the quantum low-degree test of [NV18a] in the form of a slight modification to it by [NW19] known as the Pauli basis test. Informally, the quantum low-degree test asks the players to measure a large number of qubits and return a highly compressed version of the measurement outcome. The Pauli basis test simply asks that the players return their uncompressed measurement outcomes, and it is designed by direct reduction to the quantum low-degree test. In Section 7.3.1 we describe the Pauli basis test as a nonlocal game $\mathfrak{G}^{\mathrm{PAULI}}$ and show that its question distribution is implementable via a CL distribution, as we did with the classical low-degree test in Section 7.1. In Section 7.3.2 we exhibit canonical parameters for the Pauli basis test and give bounds on the time complexity of executing the test.


7.3.1 The game $\mathfrak{G}^{\mathrm{PAULI}}$


We start by discussing parameter settings. The game is parametrized by a tuple

$$\mathsf{qldparams} = (q,m,d),$$

where $q,m,d \in \mathbb{N}$ are integers. We sometimes write $\mathfrak{G}^{\mathrm{PAULI}}_{\mathsf{qldparams}}$ to emphasize the dependence of the Pauli basis test on the parameters.

Informally, the test is meant to certify that the players share a state of the form $|\mathrm{EPR}_q\rangle^{\otimes M}$, where $M = 2^m$. Its question set includes questions that are axis-parallel/diagonal lines and points in $\mathbb{F}_q^m$ which are meant to correspond to questions in the classical low-degree test, and questions of the form $(\mathrm{PAULI}, W)$, for $W \in \{X,Z\}$. Upon receipt of a question of the latter form, the players are expected to perform the POVM $\{\tau^W_h\}_{h \in \mathbb{F}_q^M}$ and report the outcome $h$ as their answer.


The key idea behind the test is for the provers to encode their outcomes using the low-degree encoding
from Section 3.4. Given an outcome $h \in \mathbb{F}_q^M$, the low-degree encoding of $h$ is the polynomial $g_h : \mathbb{F}_q^m \to \mathbb{F}_q$.
Rather than asking the provers to always return the entire string $h$, many of the subtests in the game
$\mathfrak{G}^{\mathrm{PAULI}}$ will “probe” limited information about $h$ as follows: the verifier provides the provers with a string
$u_W \in \mathbb{F}_q^m$, and using this they are expected to evaluate $g_h(u_W) \in \mathbb{F}_q$ and return it to the verifier. For
a fixed value of W, then, the prover’s responses to these “probes” should be consistent with some low-degree
polynomial in the input $u_W$. To check this, the verifier can perform the classical low-degree test from
Section 7.1. This ensures that the prover’s responses for the W = X basis are consistent with each other,
as well as their responses for the W = Z basis, although it does not test consistency between the X and Z
responses.
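To make the probing step concrete, the following is a minimal sketch of evaluating a low-degree encoding. It rests on assumptions that go beyond the text above: we take $d = 1$, so that $g_h$ is the multilinear extension of $h$ indexed by $\{0,1\}^m$; we write $g_h(u) = h \cdot \mathrm{ind}_m(u)$ with $\mathrm{ind}_m(u)$ the vector of multilinear indicator values (the precise convention of Section 3.4 may differ); and we fix the small field $\mathbb{F}_8 = \mathbb{F}_2[x]/(x^3 + x + 1)$ for illustration. The helper names `gf_mul`, `ind` and `encode` are ours.

```python
from itertools import product

K = 3            # field F_q with q = 2**K = 8 (illustrative choice)
MOD = 0b1011     # x^3 + x + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply two elements of F_8, encoded as integers 0..7."""
    r = 0
    while b:
        if b & 1:
            r ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a & (1 << K):    # reduce modulo x^3 + x + 1
            a ^= MOD
    return r

def ind(u):
    """Multilinear indicator vector of u in F_8^m, indexed by x in {0,1}^m."""
    out = []
    for x in product((0, 1), repeat=len(u)):
        term = 1
        for xi, ui in zip(x, u):
            term = gf_mul(term, ui if xi else ui ^ 1)   # u_i or 1 - u_i
        out.append(term)
    return out

def encode(h, u):
    """Evaluate g_h(u) = h . ind(u) for h in F_8^M with M = 2^m."""
    acc = 0
    for hx, ix in zip(h, ind(u)):
        acc ^= gf_mul(hx, ix)
    return acc

h = [3, 5, 0, 6]                                  # an outcome h in F_8^4 (m = 2)
cube = list(product((0, 1), repeat=2))
values_on_cube = [encode(h, x) for x in cube]     # agrees with h entry by entry
probe = encode(h, (2, 7))                         # one "probe" at a non-Boolean point
```

On Boolean points the encoding reproduces $h$ itself, while a probe at a general point returns a single field element, which is the kind of compressed information the subtests request.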

The next set of subtests the verifier performs ensure that the provers’ X and Z measurements are consistent
with each other. On the X side, consider the measurement in which the prover performs the $\{\tau^X_h\}$
POVM, receives outcome $h_X$, and outputs $g_{h_X}(u_X) \in \mathbb{F}_q$. For technical reasons that will become clear
shortly, it will be convenient for the prover’s measurement to have two outcomes rather than $q$ outcomes. To
accomplish this, the verifier also provides the prover with an element $r_X \in \mathbb{F}_q$; given this, the prover should
output not $g_{h_X}(u_X) \in \mathbb{F}_q$ but $\mathrm{tr}(g_{h_X}(u_X) \cdot r_X)$, which is an element of $\mathbb{F}_2$. Similarly, on the Z side, suppose
the prover is provided $r_Z \in \mathbb{F}_q$, performs $\{\tau^Z_h\}$ to receive $h_Z$, and outputs $\mathrm{tr}(g_{h_Z}(u_Z) \cdot r_Z) \in \mathbb{F}_2$. Together,
these two are a pair of two-outcome measurements, one in the X basis and one in the Z basis, and as it turns
out, this pair of measurements either commutes with each other or anticommutes with each other (this is
why we modified the measurements to have only two outcomes). The key quantity to determine which of
these is the case is the following complicated-looking expression:


$$\gamma = \mathrm{tr}\big((\mathrm{ind}_m(u_X) \cdot r_X) \cdot (\mathrm{ind}_m(u_Z) \cdot r_Z)\big).$$


If $\gamma = 0$, then these two measurements commute, and if $\gamma = 1$, then these two measurements anticommute
(this fact is established in the proof of Lemma 7.13 below). In the $\gamma = 0$ case, the verifier can check
whether these two measurements commute by performing the “commutation check”, which asks the provers
to simultaneously measure both $\mathrm{tr}(g_{h_X}(u_X) \cdot r_X)$ and $\mathrm{tr}(g_{h_Z}(u_Z) \cdot r_Z)$ and report their values. In the $\gamma = 1$
case, on the other hand, the measurements anticommute, which implies that they cannot simultaneously
be measured. In spite of this, the verifier can still check that the measurements anticommute by using the
Magic Square game from Section 7.2 above. As was shown in Theorem 7.11, any pair of anticommuting
measurements can be used to form a perfect strategy for the Magic Square game (and indeed the reverse is
true as well: any perfect strategy for the Magic Square game entails a pair of anticommuting measurements).
Together, these give the main subtests for the game $\mathfrak{G}^{\mathrm{PAULI}}$.
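The commute-or-anticommute dichotomy is easiest to see for qubits. The sketch below is an illustration for the case $q = 2$ only, not the construction used in the test: for bit strings $a, b \in \mathbb{F}_2^n$ (encoded as integers) it builds $X(a) = \bigotimes_i X^{a_i}$ and $Z(b) = \bigotimes_i Z^{b_i}$ as explicit matrices and checks that $X(a) Z(b) = (-1)^{a \cdot b} Z(b) X(a)$, so that a single bit $a \cdot b$ plays the role that $\gamma$ plays above. All function names are ours.

```python
def x_op(a, n):
    """Matrix of X(a) on n qubits: maps basis state |x> to |x XOR a>."""
    dim = 1 << n
    return [[1 if row == (col ^ a) else 0 for col in range(dim)] for row in range(dim)]

def z_op(b, n):
    """Matrix of Z(b) on n qubits: multiplies |x> by (-1)^(b.x)."""
    dim = 1 << n
    return [[(-1) ** bin(b & col).count("1") if row == col else 0
             for col in range(dim)] for row in range(dim)]

def matmul(p, q):
    """Plain dense matrix product on lists of lists."""
    dim = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def gamma_bit(a, b):
    """Inner product a.b over F_2, with bit strings encoded as integers."""
    return bin(a & b).count("1") % 2

def commutes_up_to_sign(a, b, n):
    """Check X(a) Z(b) == (-1)^(a.b) Z(b) X(a) entry by entry."""
    sign = (-1) ** gamma_bit(a, b)
    xz = matmul(x_op(a, n), z_op(b, n))
    zx = matmul(z_op(b, n), x_op(a, n))
    return xz == [[sign * entry for entry in row] for row in zx]

n = 3
all_ok = all(commutes_up_to_sign(a, b, n)
             for a in range(1 << n) for b in range(1 << n))
```

When `gamma_bit(a, b)` is 0 the two observables can be measured jointly; when it is 1 they anticommute, which is the situation handled by the Magic Square subtest.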


Definition 7.12 (Admissible parameters). We say that the tuple qldparams = (q, m, d) is admissible if q is
an admissible field size (Definition 3.15) and m|q.


Question distribution. We now describe the question distribution of the game $\mathfrak{G}^{\mathrm{PAULI}}$, and show that it is
a (typed) CL distribution. The question types in the game $\mathfrak{G}^{\mathrm{PAULI}}$ are


$$\mathcal{T}^{\mathrm{PAULI}} = \big(\{\mathrm{POINT}, \mathrm{ALINE}, \mathrm{DLINE}, \mathrm{PAULI}, \mathrm{PAIR}\} \times \{X, Z\}\big) \cup \mathcal{T}^{\mathrm{MS}} \cup \{\mathrm{PAIR}\}, \tag{53}$$


where $\mathcal{T}^{\mathrm{MS}}$ is the question type set of the Magic Square game defined in Section 7.2.

Before presenting the CL functions of the Pauli basis test, we first give an intuitive description of the
question distribution of the Pauli basis test: a pair of questions $((t_A, x_A), (t_B, x_B))$ in $\mathfrak{G}^{\mathrm{PAULI}}$ can be sampled
via the following procedure:


1. Sample a pair of types by sampling an edge $(t_A, t_B)$ of the graph $G^{\mathrm{PAULI}}$ given in Figure 7 uniformly
at random (including the self-loops).


2. Sample the following uniformly at random:

(a) (Points) $u_X, u_Z \in \mathbb{F}_q^m$,

(b) (Directions) $s \in \mathbb{F}_q$, $v \in \mathbb{F}_q^m$,

(c) (Qubit basis for (anti-)commutation) $r_X, r_Z \in \mathbb{F}_q$.
3. For $w \in \{A, B\}$ and $W \in \{X, Z\}$,


(a) If $t_w = (\mathrm{POINT}, W)$, then set $x_w = u_W$,


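(b) If $t_w = (\mathrm{ALINE}, W)$, then set $x_w = (u_0, s)$, where $u_0 = L^{\mathrm{LN}}_{e_i}(u_W)$, with $i = \chi(s)$ (see
Section 7.1.1 for definition of $L^{\mathrm{LN}}$ and $\chi(s)$),


(c) If $t_w = (\mathrm{DLINE}, W)$, then set $x_w = (u_0, s, v')$, where $i = \chi(s)$, $v' = \pi_{i-1}(v)$, and $u_0 = L^{\mathrm{LN}}_{v'}(u_W)$,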


(d) If $t_w = \mathrm{CONSTRAINT}_i$ for some $i \in \{1, \ldots, 6\}$, then set $x_w = (u_X, u_Z, r_X, r_Z)$,

(e) If $t_w = \mathrm{VARIABLE}_j$ for some $j \in \{1, \ldots, 9\}$, then set $x_w = (u_X, u_Z, r_X, r_Z)$,

(f) If $t_w = \mathrm{PAIR}$, then set $x_w = (u_X, u_Z, r_X, r_Z)$,

(g) If $t_w = (\mathrm{PAIR}, W)$, then set $x_w = (u_X, u_Z, r_X, r_Z)$,

(h) If $t_w = (\mathrm{PAULI}, W)$, then set $x_w = 0$.


[Figure 7 graphic omitted. Its vertices are the question types (POINT, W), (ALINE, W), (DLINE, W), (PAULI, W) and (PAIR, W) for W ∈ {X, Z}, together with PAIR, CONSTRAINT_1, ..., CONSTRAINT_6 and VARIABLE_1, ..., VARIABLE_9.]


Figure 7: Graph $G^{\mathrm{PAULI}}$ for the Pauli basis test. Each vertex also has a self-loop which is not drawn on the
figure for clarity.


We now specify the corresponding CL functions for each of the question types in the Pauli basis test. 


Conditional linear functions for the Pauli basis test. The CL functions corresponding to the Pauli basis
test question distribution are parameterized by the parameter tuple qldparams = (q, m, d). Let $V^{\mathrm{PAULI}}$
denote the linear space $(\mathbb{F}_q^m)^2 \times \mathbb{F}_q \times \mathbb{F}_q^m \times (\mathbb{F}_q)^2$. The space $V^{\mathrm{PAULI}}$ is decomposed into a direct sum
of the following register subspaces: $V_X, V_Z$ (which are m-dimensional), $V_s$ (which is 1-dimensional), $V_v$
(which is m-dimensional), and $V_{r_X}, V_{r_Z}$ (which are 1-dimensional). We identify elements of $V^{\mathrm{PAULI}}$ as
tuples $(u_X, u_Z, s, v, r_X, r_Z) \in (\mathbb{F}_q^m)^2 \times \mathbb{F}_q \times \mathbb{F}_q^m \times (\mathbb{F}_q)^2$. We define CL functions $L_t : V^{\mathrm{PAULI}} \to V^{\mathrm{PAULI}}$
for every type $t \in \mathcal{T}^{\mathrm{PAULI}}$.


1. For $W \in \{X, Z\}$, define $L_{\mathrm{POINT},W}(u_X, u_Z, s, v, r_X, r_Z) = L_{\mathrm{POINT}}(u_W, s, v)$ where $L_{\mathrm{POINT}}$ is the 1-level
CL function defined in Equation (48).²⁹


2. For $W \in \{X, Z\}$, define $L_{\mathrm{ALINE},W}(u_X, u_Z, s, v, r_X, r_Z) = L_{\mathrm{ALINE}}(u_W, s, v)$ where $L_{\mathrm{ALINE}}$ is the 2-level
CL function defined in Equation (49).


3. For $W \in \{X, Z\}$, define $L_{\mathrm{DLINE},W}(u_X, u_Z, s, v, r_X, r_Z) = L_{\mathrm{DLINE}}(u_W, s, v)$ where $L_{\mathrm{DLINE}}$ is the 3-level
CL function defined in Equation (51).


4. For $t \in \mathcal{T}^{\mathrm{MS}} \cup \{\mathrm{PAIR}, (\mathrm{PAIR}, X), (\mathrm{PAIR}, Z)\}$, $L_t(u_X, u_Z, s, v, r_X, r_Z) = (u_X, u_Z, 0, 0, r_X, r_Z)$, i.e., it
projects onto $V_X \oplus V_Z \oplus V_{r_X} \oplus V_{r_Z}$. Thus $L_t$ is a 1-level CL function.


5. For $W \in \{X, Z\}$, define $L_{\mathrm{PAULI},W} = 0$ as the identically 0 function. Thus it is a 0-level CL function.


²⁹ The range $V_X \oplus V_s \oplus V_v$ or $V_Z \oplus V_s \oplus V_v$ of $L_{\mathrm{POINT}}$ is embedded in $V^{\mathrm{PAULI}}$ in the natural way. The same convention is used
also in Item 2 and 3 for $L_{\mathrm{ALINE}}$ and $L_{\mathrm{DLINE}}$.


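We note that the maps $L_{\mathrm{POINT},W}$, $L_{\mathrm{ALINE},W}$, $L_{\mathrm{DLINE},W}$ act as the zero function on $V_{\overline{W}} \oplus V_{r_X} \oplus V_{r_Z}$, where
$\overline{W} = X$ if $W = Z$ and $\overline{W} = Z$ if $W = X$.

The question distribution of $\mathfrak{G}^{\mathrm{PAULI}}$ is thus a typed CL distribution on $\mathcal{T}^{\mathrm{PAULI}} \times V^{\mathrm{PAULI}} \times \mathcal{T}^{\mathrm{PAULI}} \times V^{\mathrm{PAULI}}$
where $(t_A, x_A, t_B, x_B)$ is sampled by first uniformly sampling an edge $(t_A, t_B)$ from the graph $G^{\mathrm{PAULI}}$
defined in Figure 7, sampling a uniformly random $z \in V^{\mathrm{PAULI}}$, and then setting $x_w = L_{t_w}(z)$ for $w \in \{A, B\}$.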
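The sampling procedure can be sketched as follows for the question types whose maps are fully spelled out in this section. This is a simplified illustration under several assumptions of ours: the values of $q$ and $m$ are arbitrary small numbers, field elements are represented as integers, the edge list is a hypothetical subset of Figure 7 (chosen to match pairs of types compared by the decision procedure of Figure 8, with self-loops left out), each edge is oriented at random, and ALINE, DLINE and Magic Square types are omitted because they rely on definitions from Sections 7.1.1 and 7.2.

```python
import random

Q, M_DIM = 8, 4   # illustrative field size q and dimension m (not canonical values)

# Hypothetical subset of the edges of Figure 7.
EDGES = [
    (("POINT", "X"), ("PAULI", "X")), (("POINT", "Z"), ("PAULI", "Z")),
    (("POINT", "X"), ("PAIR", "X")), (("POINT", "Z"), ("PAIR", "Z")),
    (("PAIR", "X"), "PAIR"), (("PAIR", "Z"), "PAIR"),
]

def sample_seed(rng):
    """Uniform seed z = (u_X, u_Z, s, v, r_X, r_Z) in V^PAULI."""
    vec = lambda: tuple(rng.randrange(Q) for _ in range(M_DIM))
    return {"uX": vec(), "uZ": vec(), "s": rng.randrange(Q), "v": vec(),
            "rX": rng.randrange(Q), "rZ": rng.randrange(Q)}

def question(t, z):
    """Apply the map L_t to the seed z for the types covered by this sketch."""
    kind, basis = t if isinstance(t, tuple) else (t, None)
    if kind == "POINT":
        return z["u" + basis]                      # x_w = u_W
    if kind == "PAULI":
        return 0                                   # 0-level CL function
    if kind == "PAIR":                             # both PAIR and (PAIR, W)
        return (z["uX"], z["uZ"], z["rX"], z["rZ"])
    raise NotImplementedError(kind)                # other types omitted here

def sample_question_pair(rng):
    """Sample ((t_A, x_A), (t_B, x_B)) from one edge and one shared seed."""
    t_a, t_b = rng.choice(EDGES)
    if rng.random() < 0.5:                         # random orientation (our assumption)
        t_a, t_b = t_b, t_a
    z = sample_seed(rng)                           # the same seed feeds both players
    return (t_a, question(t_a, z)), (t_b, question(t_b, z))
```

The point of the sketch is that both questions are functions of one shared seed $z$, which is what makes cross-checks such as (POINT, W) against (PAIR, W) meaningful.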


Decision procedure. The decision procedure for $\mathfrak{G}^{\mathrm{PAULI}}$ is presented in Figure 8. Similarly to Figure 3,
we provide a table that summarizes a parsing scheme for the questions and answers, depending on the
type of question. The answers are bit strings that are interpreted as more structured objects such as elements
over $\mathbb{F}_q$, vectors, or polynomials, depending on the question. In the “low-degree check”, the decision
procedure $D^{\mathrm{PAULI}}$ calls the classical low-degree decision procedure $D^{\mathrm{LD}}$ parametrized by the tuple
ldparams = (q, m, d, 1) (defined in Section 7.1) as a subroutine.


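Type             Question Content                                                                       Answer Format

(POINT, W)       $y \in \mathbb{F}_q^m$                                                                 Element of $\mathbb{F}_q$
(ALINE, W)       $(u_W, s) \in \mathbb{F}_q^m \times \mathbb{F}_q$                                      Polynomial $f : \mathbb{F}_q \to \mathbb{F}_q$
(DLINE, W)       $(u_W, s, v) \in \mathbb{F}_q^m \times \mathbb{F}_q \times \mathbb{F}_q^m$             Polynomial $f : \mathbb{F}_q \to \mathbb{F}_q$
PAIR             $(u_X, u_Z, r_X, r_Z) \in (\mathbb{F}_q^m)^2 \times \mathbb{F}_q^2$                    $(\beta_X, \beta_Z) \in \mathbb{F}_2^2$
(PAIR, W)        $(u_X, u_Z, r_X, r_Z) \in (\mathbb{F}_q^m)^2 \times \mathbb{F}_q^2$                    Element of $\mathbb{F}_2$
CONSTRAINT_i     $(u_X, u_Z, r_X, r_Z) \in (\mathbb{F}_q^m)^2 \times \mathbb{F}_q^2$                    Triple of bits in $\mathbb{F}_2^3$
VARIABLE_j       $(u_X, u_Z, r_X, r_Z) \in (\mathbb{F}_q^m)^2 \times \mathbb{F}_q^2$                    Element of $\mathbb{F}_2$
(PAULI, W)       0                                                                                      Element of $\mathbb{F}_q^M$


Table: Question and answer format of the Pauli basis game.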


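On input $(t_A, x_A, t_B, x_B, a_A, a_B)$, the decision procedure $D^{\mathrm{PAULI}}_{\mathrm{qldparams}}$ performs the following checks
for $w \in \{A, B\}$, where $\overline{w}$ denotes the other player:


1. (Consistency check). If $t_A = t_B$, accept iff $a_A = a_B$.


2. (Low-degree). Let ldparams = (q, m, d, 1).


(a) If $t_w = (\mathrm{POINT}, W)$, $t_{\overline{w}} = (\mathrm{ALINE}, W)$, accept if $D^{\mathrm{LD}}_{\mathrm{ldparams}}$ accepts
$(\mathrm{POINT}, x_w, \mathrm{ALINE}, x_{\overline{w}}, a_w, a_{\overline{w}})$.

(b) If $t_w = (\mathrm{POINT}, W)$, $t_{\overline{w}} = (\mathrm{DLINE}, W)$, accept if $D^{\mathrm{LD}}_{\mathrm{ldparams}}$ accepts
$(\mathrm{POINT}, x_w, \mathrm{DLINE}, x_{\overline{w}}, a_w, a_{\overline{w}})$.


3. (Consistency check). If $t_w = (\mathrm{POINT}, W)$, $t_{\overline{w}} = (\mathrm{PAULI}, W)$, accept if $g_{a_{\overline{w}}}(x_w) = a_w$,
where $g_{a_{\overline{w}}}$ is the low-degree encoding of $a_{\overline{w}} \in \mathbb{F}_q^M$ defined in Section 3.4.


In the remaining four cases, the decision procedure first computes the number


$$\gamma = \mathrm{tr}\big((\mathrm{ind}_m(u_X) \cdot r_X) \cdot (\mathrm{ind}_m(u_Z) \cdot r_Z)\big), \tag{54}$$

where we recall the $\mathrm{ind}_m(\cdot)$ vector from Section 3.4.


4. (Commutation check). If $t_w = (\mathrm{PAIR}, W)$, $t_{\overline{w}} = \mathrm{PAIR}$, accept if $a_w = \beta_W$ or $\gamma \neq 0$, where $a_{\overline{w}} = (\beta_X, \beta_Z)$.

5. (Consistency check). If $t_w = (\mathrm{POINT}, W)$, $t_{\overline{w}} = (\mathrm{PAIR}, W)$, accept if $\mathrm{tr}(a_w r_W) = a_{\overline{w}}$
or $\gamma \neq 0$.

6. (Magic square check). If $t_w = \mathrm{CONSTRAINT}_i$, $t_{\overline{w}} = \mathrm{VARIABLE}_j$, accept if $\gamma = 0$, or
$a_w$ satisfies constraint $\mathrm{CONSTRAINT}_i$ and $\alpha_j = a_{\overline{w}}$, where $\alpha_j$ is the value that $a_w$ assigns to $\mathrm{VARIABLE}_j$.


7. (Consistency check). If $t_w = (\mathrm{POINT}, W)$, $t_{\overline{w}} = \mathrm{VARIABLE}_j$, accept if $\gamma = 0$ or if
$j = 1$, $W = X$, and $\mathrm{tr}(a_w r_X) = a_{\overline{w}}$ or if $j = 5$, $W = Z$, and $\mathrm{tr}(a_w r_Z) = a_{\overline{w}}$.


Figure 8: Specification of the decision procedure $D^{\mathrm{PAULI}}_{\mathrm{qldparams}}$.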
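As an informal illustration of how $\gamma$ gates the checks in Items 4 and 5, here is a small sketch. It is not the decider itself: field multiplication and the trace are passed in as functions, and the example instantiates them with $\mathbb{F}_2$ (where the trace is the identity) purely for brevity. The helper `gamma_f2` additionally assumes multilinear indicators, for which $\mathrm{ind}_m(u)$ at a Boolean point $u$ is the unit vector $e_u$. All names are ours.

```python
def commutation_check(gamma, a_single, a_pair, basis):
    """Item 4: types (PAIR, W) and PAIR, with a_pair = (beta_X, beta_Z)."""
    if gamma != 0:
        return True                    # the check only binds when gamma = 0
    beta_x, beta_z = a_pair
    return a_single == (beta_x if basis == "X" else beta_z)

def point_pair_check(gamma, a_point, r_w, a_pair_w, mul, trace):
    """Item 5: types (POINT, W) and (PAIR, W); needs tr(a_w r_W) = a_wbar when gamma = 0."""
    if gamma != 0:
        return True
    return trace(mul(a_point, r_w)) == a_pair_w

def gamma_f2(u_x, u_z, r_x, r_z):
    """Equation (54) specialised to q = 2 under the unit-vector assumption:
    ind(u_X) . ind(u_Z) is 1 exactly when u_X == u_Z."""
    return int(u_x == u_z) & r_x & r_z

# F_2 instantiation: multiplication is AND, the trace is the identity.
mul_f2 = lambda a, b: a & b
trace_f2 = lambda a: a

g_same = gamma_f2((0, 1), (0, 1), 1, 1)    # same point, both selectors on
g_diff = gamma_f2((0, 1), (1, 1), 1, 1)    # different points
```

In this toy instantiation $\gamma = 1$ only when the X and Z observables address the same coordinate, matching the intuition that anticommutation arises when the two measurements overlap.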




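We describe an honest, value-1 PCC strategy for the Pauli basis test.

Lemma 7.13. The Pauli basis test $\mathfrak{G}^{\mathrm{PAULI}}_{\mathrm{qldparams}}$ has a value-1 SPCC strategy.


Proof. We begin by specifying the value-1 strategy $\mathscr{S} = (|\psi\rangle, E)$. The state is

$$|\psi\rangle = |\mathrm{EPR}_q\rangle^{\otimes M} \otimes |\mathrm{EPR}_2\rangle.$$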


Now we specify the measurements. We start with measurements associated with questions of type POINT,
ALINE, DLINE, and PAULI. Using notation introduced in Section 3.4, for $W \in \{X, Z\}$,


$$\begin{aligned}
E^{(\mathrm{POINT},W),y}_{a} &= \tau^W_{[g_\cdot(y) = a]} \otimes I, \\
E^{(\mathrm{ALINE},W),\ell}_{f} &= \tau^W_{[g_\cdot|_\ell = f]} \otimes I, \\
E^{(\mathrm{DLINE},W),\ell}_{f} &= \tau^W_{[g_\cdot|_\ell = f]} \otimes I, \\
E^{(\mathrm{PAULI},W)}_{h} &= \tau^W_h \otimes I.
\end{aligned}$$


Here, the bracket notation used to post-process measurement outcomes is defined in Definition 3.23. Thus,
for example, the measurement on question $((\mathrm{POINT}, W), y)$ corresponds to first performing the measurement
$\{\tau^W_h\}_{h \in \mathbb{F}_q^M}$, receiving an outcome $h \in \mathbb{F}_q^M$, and then outputting the value $a = g_h(y)$.

For ALINE and DLINE questions, the question content $\ell$ denotes either an axis-parallel line $\ell(u_0, e_i)$
for some $i \in \{1, 2, \ldots, m\}$, or a diagonal line $\ell(u_0, v')$ for some $v' \in \mathbb{F}_q^m$. As described at the beginning
of Section 7.3.1, axis-parallel lines are specified by pairs $(u_0, s) \in \mathbb{F}_q^m \times \mathbb{F}_q$ and diagonal lines are specified
by triples $(u_0, s, v) \in \mathbb{F}_q^m \times \mathbb{F}_q \times \mathbb{F}_q^m$. The measurement on question $((\mathrm{ALINE}, W), \ell)$, for example,
corresponds to first performing the measurement $\{\tau^W_h\}_{h \in \mathbb{F}_q^M}$ to get an outcome $h \in \mathbb{F}_q^M$, and reporting the
univariate polynomial $f(t) = g_h(u_0 + t e_i)$ where $\ell$ is specified by $(u_0, s)$ and $i = \chi(s)$.

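Next we specify the POVMs associated with questions of type CONSTRAINT, VARIABLE, and PAIR.
Questions with these question types have a question content that is a tuple $\omega = (u_X, u_Z, r_X, r_Z) \in (\mathbb{F}_q^m)^2 \times \mathbb{F}_q^2$.
Given such a tuple, consider the two $\mathbb{F}_2$-valued POVMs $A = \{A_b\}_{b \in \mathbb{F}_2}$ and $B = \{B_b\}_{b \in \mathbb{F}_2}$ defined as


$$A_b = \tau^X_{[\mathrm{tr}(g_\cdot(u_X) r_X) = b]} \otimes I, \tag{55}$$
$$B_b = \tau^Z_{[\mathrm{tr}(g_\cdot(u_Z) r_Z) = b]} \otimes I. \tag{56}$$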

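We would like to determine when the two observables $O_A = A_0 - A_1$ and $O_B = B_0 - B_1$ commute
or anti-commute. Towards this we derive alternative expressions for these observables from which their
commutativity becomes plain from inspection. We begin by inspecting the first matrices on the right-hand
sides of Equations (55) and (56):


$$\begin{aligned}
\tau^W_{[\mathrm{tr}(g_\cdot(u_W) r_W) = b]} &= \sum_{h \,:\, \mathrm{tr}(g_h(u_W) r_W) = b} \tau^W_h \\
&= \sum_{h \,:\, \mathrm{tr}((h \cdot \mathrm{ind}_m(u_W)) r_W) = b} \tau^W_h, \qquad (57)
\end{aligned}$$


where (57) follows by Definition 13. As a result,


$$\begin{aligned}
\tau^W_{[\mathrm{tr}(g_\cdot(u_W) r_W) = 0]} - \tau^W_{[\mathrm{tr}(g_\cdot(u_W) r_W) = 1]}
&= \sum_{h \,:\, \mathrm{tr}((h \cdot \mathrm{ind}_m(u_W)) r_W) = 0} \tau^W_h \;-\; \sum_{h \,:\, \mathrm{tr}((h \cdot \mathrm{ind}_m(u_W)) r_W) = 1} \tau^W_h \\
&= \sum_{h} (-1)^{\mathrm{tr}((h \cdot \mathrm{ind}_m(u_W)) r_W)} \, \tau^W_h \\
&= \tau^W(\mathrm{ind}_m(u_W) r_W), \qquad (58)
\end{aligned}$$


where the last step uses (20). As a result, Equation (58) and Equations (55) and (56) imply that

$$O_A = \tau^X(\mathrm{ind}_m(u_X) r_X) \otimes I, \qquad O_B = \tau^Z(\mathrm{ind}_m(u_Z) r_Z) \otimes I.$$

Now, let $\gamma = \mathrm{tr}\big((\mathrm{ind}_m(u_X) r_X) \cdot (\mathrm{ind}_m(u_Z) r_Z)\big) \in \mathbb{F}_2$, as in Equation (54). Then by Equation (16),

$$O_A O_B = (-1)^{\gamma} O_B O_A.$$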


As a result, $\gamma$ quantifies whether the observables $O_A$ and $O_B$ commute, and therefore whether the measurements
$A$ and $B$ commute. If $\gamma = 0$, they commute. If $\gamma = 1$, they anti-commute. We now specify the
POVMs associated with questions of type CONSTRAINT, VARIABLE, and PAIR, considering separately the
cases when $\gamma = 0$ or $\gamma = 1$.


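1. If $\gamma = 0$, for each $\beta_X, \beta_Z \in \mathbb{F}_2$ define


$$E^{\mathrm{PAIR},\omega}_{\beta_X,\beta_Z} = A_{\beta_X} \cdot B_{\beta_Z}, \tag{59}$$
$$E^{(\mathrm{PAIR},X),\omega}_{\beta_X} = A_{\beta_X}, \qquad E^{(\mathrm{PAIR},Z),\omega}_{\beta_Z} = B_{\beta_Z}.$$


The POVMs associated with questions of type CONSTRAINT and VARIABLE are defined to be trivial.
In particular, we define

$$E^{\mathrm{CONSTRAINT}_i,\omega}_{0,0,0} = E^{\mathrm{VARIABLE}_j,\omega}_{0} = I,$$


and the remaining POVM elements in these measurements are set to be zero.


2. If $\gamma = 1$, then $O_A$ and $O_B$ anti-commute. In this case we define measurements $E^{\mathrm{CONSTRAINT}_i,\omega}$ and
$E^{\mathrm{VARIABLE}_j,\omega}$ for all $i \in \{1, \ldots, 6\}$ and $j \in \{1, \ldots, 9\}$ to be those guaranteed by Theorem 7.11.
Measurements associated with inputs of type PAIR are defined to be trivial. In particular, we define


$$E^{\mathrm{PAIR},\omega}_{0,0} = E^{(\mathrm{PAIR},X),\omega}_{0} = E^{(\mathrm{PAIR},Z),\omega}_{0} = I,$$


and the remaining matrices in these measurements are set to be zero.


This completes the specification of the strategy.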

Now, we show that $\mathscr{S}$ is a value-1 SPCC strategy. It is clearly symmetric and projective. To show that it
is consistent, we note that all measurements are Pauli basis measurements, which are consistent, or measurements
produced by Theorem 7.11, which are also consistent. The only exception is the PAIR measurement in
the $\gamma = 0$ case, which by Equation (59) is a product of two commuting, consistent measurements, and so it is
therefore also consistent. To show that it is commuting, we note that for each $W \in \{X, Z\}$, all (POINT, W),
(ALINE, W), (DLINE, W) and (PAULI, W) measurements commute as they are all measurements in the
Pauli W basis. Next, if $\gamma = 0$ then the CONSTRAINT and VARIABLE measurements commute trivially, and
for $W \in \{X, Z\}$, the (POINT, W) measurement commutes with the (PAIR, W) measurement, as they are
both W basis measurements, and the measurement $E^{\mathrm{PAIR},\omega}_{\beta_X,\beta_Z} = A_{\beta_X} B_{\beta_Z}$ commutes with $E^{(\mathrm{PAIR},W),\omega}$ because
$A$ and $B$ commute. On the other hand, if $\gamma \neq 0$, then the PAIR measurements commute trivially, and the
CONSTRAINT and VARIABLE measurements commute by Theorem 7.11. Finally, the (POINT, X) measurement
commutes with $\mathrm{VARIABLE}_1$ because both are X-basis measurements, and likewise both (POINT, Z)
and $\mathrm{VARIABLE}_5$ are Z-basis measurements.

It remains to show that $\mathscr{S}$ is value-1. Consider first the first three tests executed by the decision procedure
in Figure 8. The strategy passes the consistency checks with probability 1 because it is projective and
consistent. It passes the low-degree checks because it answers those consistently with an honest strategy in
the classical low-degree test.

Next, consider the remaining four tests. Fix an $\omega = (u_X, u_Z, r_X, r_Z)$ and $\gamma$ as in (54). If $\gamma = 0$, then
the strategy passes the commutation check with probability 1 by construction. As for the consistency check
in Item 5, we can write the (POINT, W) measurement, with $y = u_W$, as follows:


$$E^{(\mathrm{POINT},W),y}_{[\mathrm{tr}(\cdot\, r_W) = a]} = \tau^W_{[\mathrm{tr}(g_\cdot(y) r_W) = a]} \otimes I = E^{(\mathrm{PAIR},W),\omega}_{a}.$$

As a result, due to the consistency of the (PAIR, W) measurement, the consistency check is passed with probability
1. On the other hand, if $\gamma \neq 0$, then the strategy passes the Magic Square check by Theorem 7.11.
As for the consistency check in Item 7, we can write the (POINT, X) measurement as follows:

$$E^{(\mathrm{POINT},X),y}_{[\mathrm{tr}(\cdot\, r_X) = a]} = \tau^X_{[\mathrm{tr}(g_\cdot(y) r_X) = a]} \otimes I = E^{\mathrm{VARIABLE}_1,\omega}_{a},$$

where the last step is by Theorem 7.11. As these measurements are consistent, this test is passed with
probability 1. □


Soundness of the Pauli basis test. We state the soundness properties of the Pauli basis test. The following 
is an adaptation of the self-testing statement in [NW19, Theorem 6.4]. 


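Theorem 7.14. There exists a function

$$\delta_{\mathrm{QLD}}(\varepsilon, m, d, q) = a (md)^a \big(\varepsilon^b + q^{-b} + 2^{-bmd}\big)$$


for universal constants $a > 1$ and $0 < b < 1$ such that the following holds. For all admissible parameter
tuples qldparams = (q, m, d) and for all strategies $\mathscr{S} = (|\psi\rangle, A, B)$ for the game $\mathfrak{G}^{\mathrm{PAULI}}_{\mathrm{qldparams}}$ that succeed
with probability at least $1 - \varepsilon$, there exist local isometries $\phi_A : \mathcal{H}_A \to \mathcal{H}_{A'} \otimes \mathcal{H}_{A''}$, $\phi_B : \mathcal{H}_B \to \mathcal{H}_{B'} \otimes \mathcal{H}_{B''}$
(where $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ and $\mathcal{H}_{A''}, \mathcal{H}_{B''} \cong (\mathbb{C}^q)^{\otimes M}$ with $M = 2^m$) and a state $|\mathrm{AUX}\rangle \in \mathcal{H}_{A'} \otimes \mathcal{H}_{B'}$ such
that

1. $\big\| \phi_A \otimes \phi_B |\psi\rangle - |\mathrm{AUX}\rangle \otimes |\mathrm{EPR}_q\rangle^{\otimes M} \big\| \le \delta_{\mathrm{QLD}}(\varepsilon, m, d, q)$,


2. Letting $\tilde{A}^x_a = \phi_A A^x_a \phi_A^\dagger$ and $\tilde{B}^x_a = \phi_B B^x_a \phi_B^\dagger$, we have for $W \in \{X, Z\}$


$$\tilde{A}^{(\mathrm{PAULI},W)}_u \otimes I_{B'B''} \approx_{\delta_{\mathrm{QLD}}} (\tau^W_u)_{A''} \otimes I_{A'B'B''},$$
$$I_{A'A''} \otimes \tilde{B}^{(\mathrm{PAULI},W)}_u \approx_{\delta_{\mathrm{QLD}}} I_{A'A''B'} \otimes (\tau^W_u)_{B''},$$


where the $\approx_{\delta_{\mathrm{QLD}}}$ statement holds with respect to the state $|\mathrm{AUX}\rangle_{A'B'} \otimes |\mathrm{EPR}_q\rangle^{\otimes M}_{A''B''}$ and the answer
summation is over $u \in \mathbb{F}_q^M$.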


We provide a proof of Theorem 7.14 in Appendix A. 

The strategies described in the proof of Lemma 7.13 and in the conclusion of Theorem 7.14 are described
in terms of qudit Pauli measurements and maximally entangled states defined over q-dimensional
qudits. However, it is more convenient for our application of the Pauli basis test to instead deal with qubits.
In particular, we use the Pauli basis test in the “introspection game” of Section 8, where the players are commanded
to sample questions according to a sampler S of a normal form verifier. By definition of normal
form verifier, S is a sampler over $\mathbb{F}_2$, and therefore it is natural to use a test for qubit Pauli measurements.

Since we use field sizes q that are powers of 2, Lemma 3.26 shows that the conclusions of Theorem 7.14 
can also be described in terms of testing for maximally entangled states over qubits and qubit Pauli mea- 
surements. 


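Corollary 7.15. There exists a function

$$\delta_{\mathrm{QLD}}(\varepsilon, m, d, q) = a (md)^a \big(\varepsilon^b + q^{-b} + 2^{-bmd}\big)$$


for universal constants $a > 1$ and $0 < b < 1$ such that the following holds. For all admissible parameter
tuples qldparams = (q, m, d) and for all strategies $\mathscr{S} = (|\psi\rangle, A, B)$ for the game $\mathfrak{G}^{\mathrm{PAULI}}_{\mathrm{qldparams}}$ that succeed
with probability at least $1 - \varepsilon$, there exist local isometries $\phi_A : \mathcal{H}_A \to \mathcal{H}_{A'} \otimes \mathcal{H}_{A''}$, $\phi_B : \mathcal{H}_B \to \mathcal{H}_{B'} \otimes \mathcal{H}_{B''}$
(where $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ and $\mathcal{H}_{A''}, \mathcal{H}_{B''} \cong (\mathbb{C}^2)^{\otimes M \log q}$ with $M = 2^m$) and a state $|\mathrm{AUX}\rangle \in \mathcal{H}_{A'} \otimes \mathcal{H}_{B'}$
such that

1. $\big\| \phi_A \otimes \phi_B |\psi\rangle - |\mathrm{AUX}\rangle \otimes |\mathrm{EPR}_2\rangle^{\otimes M \log q} \big\| \le \delta_{\mathrm{QLD}}(\varepsilon, m, d, q)$,

2. Letting $\tilde{A}^x_a = \phi_A A^x_a \phi_A^\dagger$ and $\tilde{B}^x_a = \phi_B B^x_a \phi_B^\dagger$, we have for $W \in \{X, Z\}$


$$\tilde{A}^{(\mathrm{PAULI},W)}_u \otimes I_{B'B''} \approx_{\delta_{\mathrm{QLD}}} (\sigma^W_u)_{A''} \otimes I_{A'B'B''},$$
$$I_{A'A''} \otimes \tilde{B}^{(\mathrm{PAULI},W)}_u \approx_{\delta_{\mathrm{QLD}}} I_{A'A''B'} \otimes (\sigma^W_u)_{B''},$$


where the $\approx_{\delta_{\mathrm{QLD}}}$ statement holds with respect to the state $|\mathrm{AUX}\rangle_{A'B'} \otimes |\mathrm{EPR}_2\rangle^{\otimes M \log q}_{A''B''}$ and the answer
summation is over $u \in \mathbb{F}_q^M$. Here, $\sigma^W_u$ denotes the tensor product of single-qubit Pauli measurements


$$\sigma^W_u = \bigotimes_{i=1}^{M} \bigotimes_{j=1}^{\log q} \sigma^W_{u_{ij}},$$


where $\kappa(u_i) = (u_{ij})_{1 \le j \le \log q}$ (using the bijection $\kappa : \mathbb{F}_q \to \mathbb{F}_2^{\log q}$ from Section 3.3).


Proof. Lemma 3.26 implies that there are local isometries that map $|\mathrm{EPR}_q\rangle^{\otimes M}$ to $|\mathrm{EPR}_2\rangle^{\otimes M \log q}$, and map
the generalized Pauli measurements $\tau^W_u$ to $\bigotimes_{i=1}^{M} \bigotimes_{j=1}^{\log q} \sigma^W_{u_{ij}}$, where $\kappa(u_i) = (u_{i1}, \ldots, u_{i \log q})$. Combining
this with the isometries guaranteed by Theorem 7.14, we obtain the statement of the Corollary. □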


7.3.2 Canonical parameters and complexity of the Pauli basis test 


We specify a canonical setting of the parameter tuple qldparams as a function of an integer R. We call 
this canonical setting introparams because we use these in the “introspection” section (Section 8). We then 
give bounds on the complexity of computing the decision procedure and CL functions of the Pauli basis test 
corresponding to the parameter tuple qldparams, as a function of R. 




Definition 7.16 (Canonical parameters of the Pauli basis test). Let a, b be the universal constants specified
in Theorem 7.14. Let c denote the smallest even integer that is at least (b + a)/b. For all integers R > 4,
define the tuple introparams(R) = (q, m, d) where


• $q = 2^k$ for $k = c \lceil \log\log R \rceil + 1$.
• $m = 2^j$ where $2^j \le c \lceil \log R \rceil + 1 < 2^{j+1}$.
• $d = 1$.


Intuitively, this choice of parameters is such that the Pauli basis test certifies the presence of an $M = 2^m$-qudit
EPR state (of local dimension q), or equivalently an $M \log q$-qubit EPR state.
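For a concrete feel for Definition 7.16, the following sketch computes introparams(R) with exact integer arithmetic. Several points are assumptions on our part: logarithms are taken base 2; the constant c, which the definition derives from the unspecified universal constants a and b, is passed in as a hypothetical parameter; and $m = 2^j$ is taken to be the largest power of two not exceeding $c \lceil \log R \rceil + 1$. The function names are ours.

```python
def ceil_log2(x):
    """Smallest integer t >= 0 with 2**t >= x, for an integer x >= 1."""
    return (x - 1).bit_length()

def introparams(R, c):
    """Return (q, m, d) following Definition 7.16 for a given even constant c."""
    if R < 2 or c < 2 or c % 2:
        raise ValueError("need R >= 2 and an even c >= 2")
    log_r = ceil_log2(R)                 # ceil(log R)
    loglog_r = ceil_log2(log_r)          # ceil(log log R); exact since 2**t is an integer
    k = c * loglog_r + 1                 # odd, because c is even
    bound = c * log_r + 1
    j = bound.bit_length() - 1           # 2**j <= bound < 2**(j + 1)
    return 2 ** k, 2 ** j, 1

q, m, d = introparams(16, 2)             # small illustrative input
num_qubits = (2 ** m) * (q.bit_length() - 1)   # M * log q with M = 2**m
```

Because both q and m come out as powers of two, properties such as $2^m \geq R$ can be spot-checked directly for a range of inputs, which is a useful sanity check when experimenting with different values of c.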


Lemma 7.17. For all integers R > 4, the parameter tuple introparams(R) = (q, m, d) is admissible (see
Definition 7.12), satisfies $2^m \ge R$, and furthermore there exist universal constants $a' > 1$ and $0 < b' < 1$
such that the function $\delta_{\mathrm{QLD}}(\varepsilon, m, d, q)$ from Theorem 7.14 is at most


$$a' \big( \log(R)^{a'} \cdot \varepsilon^{b'} + \log(R)^{-b'} \big).$$


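Proof. We verify the admissibility of introparams(R) first: the field size $q = 2^k$ is admissible because k is
odd. We have m|q because for all R > 4 we have


$$2^j \le 2c \lceil \log R \rceil \le 2 \log^c R \le 2^k.$$

Next, we show that $R \le 2^m$. Since $m = 2^j > (c/2) \log R$, we have that

$$2^m \ge R^{c/2} \ge R,$$


where we used that $c \ge 2$.

Finally, we show the desired bound on the error function $\delta_{\mathrm{QLD}}(\varepsilon, m, d, q)$. First, observe that $2^{-bmd} \le q^{-b}$
because $k \le m$. Note that for R > 2, we have $m \le 2c \log R$, and furthermore $q = 2^k \ge \log^c R$. Then,
we get


$$\begin{aligned}
\delta_{\mathrm{QLD}}(\varepsilon, m, d, q) &= a (md)^a \big(\varepsilon^b + q^{-b} + 2^{-bmd}\big) && \text{(by Theorem 7.14)} \\
&\le a (md)^a \big(\varepsilon^b + 2 q^{-b}\big) && \text{(because } 2^{-bmd} \le q^{-b}\text{)} \\
&= a\, m^a \big(\varepsilon^b + 2 q^{-b}\big) && \text{(because } d = 1\text{)} \\
&\le a (2c)^a \log^a(R) \big(\varepsilon^b + 2 q^{-b}\big) && \text{(because } m \le 2c \log R\text{)} \\
&\le a (2c)^a \log^a(R) \big(\varepsilon^b + 2 (\log R)^{-cb}\big) && \text{(because } q \ge \log^c R\text{)} \\
&\le 2a (2c)^a \log^a(R) \big(\varepsilon^b + (\log R)^{-cb}\big) \\
&= 2a (2c)^a \big((\log R)^a \varepsilon^b + (\log R)^{a - cb}\big) \\
&\le 2a (2c)^a \big((\log R)^a \varepsilon^b + (\log R)^{-b}\big) && \text{(because } cb - a \ge b > 0\text{)} \\
&\le a' \big((\log R)^{a'} \varepsilon^{b} + (\log R)^{-b}\big),
\end{aligned}$$

where $a'$ is defined as the quantity $2a(2c)^a$ and we take $b' = b$. This completes the proof of the lemma. □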


Lemma 7.18. Fix an integer R > 4, let introparams(R) = (q, m, d). The integers (q, m, d) = introparams(R)
can be computed in time polylog R when given the binary representation of R as input.


Proof. This follows from the definitions of the parameters in Definition 7.16. □


Lemma 7.19. Let R > 4 be an integer, and let introparams(R) denote the parameter tuple specified by
Definition 7.16.


1. The time complexity of the decision procedure $D^{\mathrm{PAULI}}$ parameterized by introparams(R) is poly(R).


2. The time complexity of computing marginals of the CL functions $L_t$ as well as the associated factor
spaces, for $t \in \mathcal{T}^{\mathrm{PAULI}}$, is polylog R.


3. The Turing machine description of the decision procedure $D^{\mathrm{PAULI}}$ parameterized by introparams can
be computed from the binary representation of R in polylog(R) time.

Proof. Finite field arithmetic over $\mathbb{F}_q$ can be performed in time polylog q, by Lemma 3.18. The parameters
of introparams(R), which the decision procedure $D^{\mathrm{PAULI}}$ implicitly computes given R, can be computed
in time polylog(R), by Lemma 7.18. The most expensive step in $D^{\mathrm{PAULI}}$ is to evaluate the low-degree
encoding $g_a(y)$ where $a \in \mathbb{F}_q^M$ and $y \in \mathbb{F}_q^m$, which takes time poly(M, log q) = poly(R) by Lemma 3.21.

The complexity of computing the CL functions $L_t$ for types $t \in \mathcal{T}^{\mathrm{PAULI}}$ is dominated by the complexity
of computing the CL functions $L_{\mathrm{ALINE}}$, $L_{\mathrm{DLINE}}$, and $L_{\mathrm{POINT}}$ from the classical low-degree test, which takes
time poly(m, log q) = polylog(R).

The factor spaces of $L_t$ depend only on the question type t (of which there are only constantly many),
and outputting the length-m indicator vectors of the factor spaces takes O(m) = polylog(R) time.

Finally, the time complexity of computing the description of $D^{\mathrm{PAULI}}$ from the binary representation
of R requires polylog R time, because the checks performed in the decision procedure $D^{\mathrm{PAULI}}$ depend on
introparams which ultimately depends on R; we assume that the decision procedure computes the parameter
tuple introparams based on R. Thus the time complexity is dominated by the time to write R in binary. □


8 Introspection Games 


8.1 Overview 


Consider a normal form verifier $V = (S, D)$ (see Definition 5.33). In this section we design a normal form
verifier $V^{\mathrm{INTRO}} = (S^{\mathrm{INTRO}}, D^{\mathrm{INTRO}})$ (called the introspective verifier) such that in the n-th game $V^{\mathrm{INTRO}}_n$
(called the introspection game; see Definition 5.33 for the definition of the n-th game associated with a
normal form verifier) the verifier expects the players to sample for themselves questions x and y distributed
as their questions in the game $V_N$ for index $N = 2^n$; this is the “introspection” step. Note the exponential
separation of the indices of the introspection game versus the original game! Our construction of the
introspection game generalizes the introspection technique of [NW19].

Recall the execution of the N-th game corresponding to the “original” verifier V (see Definition 5.33).
Let $n \ge 1$, $N = 2^n$, and suppose that the CL functions of S on index N are $L^A, L^B$ acting on an ambient
space V. In the game $V_N$, the verifier first samples $z \in V$ uniformly at random and gives each player
$w \in \{A, B\}$ the question $x_w = L^w(z)$. In this context, the string z is referred to as the “seed”. The players
respond with answers $a_A$ and $a_B$, respectively, and the verifier accepts or rejects according to the output of
$D(N, x_A, x_B, a_A, a_B)$.

In the introspection game, with some constant probability independent of n, the verifier $V^{\mathrm{INTRO}}_n$ sends
the question pair (INTRO, A) to player w and (INTRO, B) to the other player $\overline{w}$, where $w \in \{A, B\}$ and
recall that $\overline{w} = B$ if $w = A$ and $\overline{w} = A$ otherwise. The verifier expects player w to measure their share
of the state $|\mathrm{EPR}\rangle_V$ using a coarse-grained Z-basis measurement whose outcomes range over $L^A(V)$, and
similarly player $\overline{w}$ measures the state $|\mathrm{EPR}\rangle_V$ using a coarse-grained Z-basis measurement with outcomes
that range over $L^B(V)$. If the players perform these measurements honestly, then the outcomes $(x_A, x_B)$ are
distributed exactly according to $\mu_{S,N}$, the question distribution of the game $V_N$. Players w and $\overline{w}$ are then
expected to respond with the question $x_w$ and $x_{\overline{w}}$ that they each sampled, together with strings $a_w$ and $a_{\overline{w}}$,
respectively, which are intended to be the answers for the question pair $(x_A, x_B)$ in the game $V_N$. In other
words, the players introspectively sample the question pair $(x_A, x_B)$ and then respond with the question
itself and an answer for it.
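The honest introspection step can be mimicked classically, since Z-basis measurements on a maximally entangled state give both players the same uniformly random string. The sketch below makes this explicit under simplifying assumptions of ours: the CL functions are plain linear maps over $\mathbb{F}_2$ (the 1-level case), given as hypothetical 0/1 matrices, and the shared measurement outcome is modeled as a common uniform seed z.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def apply_map(rows, z):
    """Apply an F_2-linear map, given as a list of 0/1 row vectors, to the seed z."""
    return tuple(sum(r * zi for r, zi in zip(row, z)) % 2 for row in rows)

def introspected_distribution(rows_a, rows_b, n):
    """Joint law of (x_A, x_B) when both players coarse-grain the same uniform seed z."""
    counts = Counter((apply_map(rows_a, z), apply_map(rows_b, z))
                     for z in product((0, 1), repeat=n))
    return {pair: Fraction(c, 2 ** n) for pair, c in counts.items()}

# Hypothetical example on a 3-bit seed: A reads bits 0 and 1, B reads bits 1 and 2.
L_A = [[1, 0, 0], [0, 1, 0]]
L_B = [[0, 1, 0], [0, 0, 1]]
dist = introspected_distribution(L_A, L_B, 3)
```

Each player only learns the image of the seed under their own map, yet the pair $(x_A, x_B)$ carries the correlations of the original question distribution; in this toy example the two questions always agree on the shared middle bit.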

To facilitate comprehension, we call the players that interact with the introspective verifier $V^{\mathrm{INTRO}}$ the
“introspecting players”, and the players that interact with the “original” verifier V the “original players”. To
ensure that the introspecting players follow the above procedure honestly, the introspective verifier $V^{\mathrm{INTRO}}$
first uses the (binary) Pauli basis test described in Section 7.3 to force the introspecting players to share the
state $|\mathrm{EPR}\rangle_V$. The Pauli basis test also ensures that the players measure $\sigma^W$ and report the outcome honestly
when they receive questions (PAULI, W) for $W \in \{X, Z\}$. For $v \in \{A, B\}$ the verifier cross-checks the
question pairs (INTRO, v) and (PAULI, Z) to enforce that the honest measurement is performed for question
(INTRO, v).

The main difficulty in the soundness analysis is to ensure that the answer of player w who received
question (INTRO, v) depends only on $L^v(z)$ and not on any other information about the string z. First
assume for simplicity that $L^v$ is a linear function. As shown below (based on Lemma 8.4), $L^v(z)$ can be
obtained by measuring a specific collection of $\sigma^Z$ observables; namely, the set


$$\{\sigma^Z(u) : u \in \ker(L^v)^\perp\}. \tag{60}$$


To prevent the player from obtaining any additional information the verifier needs to enforce that the player
does not additionally measure any $\sigma^Z(u)$ for $u \notin \ker(L^v)^\perp$. (We refer to any such $\sigma^Z$ as a “prohibited”
$\sigma^Z$.)


The introspective verifier achieves this by sometimes sending the READ type question (READ, v) or
HIDE$_k$ type questions (HIDE$_k$, v) to the players. When receiving the READ question, the players are required
to also measure observables from the set


$$\{\sigma^X(r) : r \in \ker(L^v)\}, \tag{61}$$


which (as shown in Lemma 8.5 below) commute with every Z-basis measurement in (60). On the other
hand, any prohibited $\sigma^Z(u)$ must anticommute with at least one of the $\sigma^X(r)$, as otherwise u would be in
$\ker(L^v)^\perp$. As a result, honestly measuring the $\sigma^X$ observables of (61) has the effect of preventing the player
from measuring any of the prohibited $\sigma^Z$ observables, so that the answer a can effectively only depend
on the question $L^v(z)$. (In the protocol, the verifier asks the player to measure the function $(L^v)^\perp(z)$ in
the X basis, rather than all of the $\sigma^X$ observables. By Lemma 8.4 the two are equivalent.) Similarly to
how questions (PAULI, Z) and (INTRO, v) are cross-checked, the questions (PAULI, X), (HIDE$_k$, v) and
(READ, v) are cross-checked in order to ensure that the honest X-basis measurements are performed for the
HIDE$_k$ questions and the READ questions.
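The linear-algebra fact behind (60) and (61) can be checked by brute force on a toy example. The sketch below assumes a linear map over $\mathbb{F}_2$ (a hypothetical $2 \times 4$ matrix of our choosing), encodes vectors as bitmasks, and uses the standard fact that the qubit observables $\sigma^Z(u)$ and $\sigma^X(r)$ commute exactly when $u \cdot r = 0$ modulo 2.

```python
def dot(a, b):
    """Inner product over F_2 of two bit vectors encoded as integers."""
    return bin(a & b).count("1") % 2

def apply_map(rows, z):
    """L(z) for a linear map L whose rows are bitmask-encoded vectors."""
    return tuple(dot(row, z) for row in rows)

def kernel(rows, n):
    """All r in F_2^n with L(r) = 0 (brute force)."""
    return [r for r in range(1 << n) if not any(apply_map(rows, r))]

def row_space(rows):
    """Span of the rows of L, which equals ker(L)^perp."""
    span = {0}
    for row in rows:
        span |= {s ^ row for s in span}
    return span

n = 4
L = [0b1100, 0b0011]             # hypothetical map: z -> (z3 + z2, z1 + z0)
ker = kernel(L, n)

# sigma^Z(u) survives the READ measurement iff it commutes with every sigma^X(r), r in ker(L).
allowed = {u for u in range(1 << n) if all(dot(u, r) == 0 for r in ker)}
prohibited = set(range(1 << n)) - allowed
```

Here `allowed` coincides with the row space of L, so the only Z-type information compatible with the X-basis measurements of (61) is what is already determined by $L(z)$; every other $\sigma^Z(u)$ anticommutes with some $\sigma^X(r)$.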

When the CL functions $L^v$ are $\ell$-level for $\ell > 1$, the introspective verifier sends one of $O(\ell)$ different
hiding questions to the players, chosen at random; together these hiding checks ensure that each of the constituent
linear maps of $L^v$ are honestly measured. Intuitively, these hiding questions “interpolate” between
questions (PAULI, Z), (INTRO, v) and (PAULI, X) in a way that, for all pairs of questions asked by the verifier,
the honest measurements commute. (See Figure 11 for an overview of the honest measurements.) This
strong commuting property of the strategy is essential for the oracularization transformation in Section 9 to
be possible.

A key property of the introspection game is that the distribution of questions (which include the Pauli
basis test questions as well as the introspection questions and the hiding questions) is also conditionally
linear. This means that the introspection game can be ultimately specified by a normal form verifier
$V^{\mathrm{INTRO}} = (S^{\mathrm{INTRO}}, D^{\mathrm{INTRO}})$, which is crucial for recursive compression of games. Moreover, while the time
complexity of the introspective verifier’s decider $D^{\mathrm{INTRO}}$ remains polynomially related to that of D on index
N, the time complexity of the sampler $S^{\mathrm{INTRO}}$ is polylog(N) (exponentially smaller), due to the efficiency
of the Pauli basis test. Finally, $S^{\mathrm{INTRO}}$ only depends on V through the number of levels $\ell$ of S as well as
upper bounds on the time complexities of S and D.


8.2 The introspective verifier 


Let $V = (S, D)$ be an $\ell$-level normal form verifier. We call V, S, and D the “original” verifier, sampler, and
decider, respectively. For every original verifier V, we define the introspective verifier $V^{\mathrm{INTRO}}$ in this section.
We use $N = 2^n$ to denote the index of the original verifier V that is simulated by the introspective verifier
$V^{\mathrm{INTRO}}$ on index n.

The introspective verifier corresponding to V and parameters $(\lambda, \ell)$ is a typed normal form verifier
$V^{\mathrm{INTRO}} = (S^{\mathrm{INTRO}}, D^{\mathrm{INTRO}})$, sketched in Section 8.1 and specified in detail in the present section (see
Section 6 for the definition of typed normal form verifiers). In the following descriptions of the sampler
$S^{\mathrm{INTRO}}$ and decider $D^{\mathrm{INTRO}}$, all parameters are functions of the index n, the number of levels $\ell$ of the sampler
S, and the parameter $\lambda$; we often leave this dependence implicit.

Let $R = N^\lambda$, and let introparams(R) = (q, m, d) denote the parameter tuple specified in Section 7.3.2.
Note that introparams is implicitly a function of n (since R is a function of n). The associated parameter
tuple introparams is intended to parametrize a Pauli basis test that certifies the presence of $Q = 2^m \log q$-qubit
EPR states; intuitively, these qubits are meant to serve as the source of randomness for the CL functions
of the original sampler S. Since we will assume that the original verifier V is $\lambda$-bounded, the original
sampler S on index N has time complexity at most R, and therefore the amount of randomness needed is at
most R. By Lemma 7.17, the number of qubits Q in the EPR states is at least R, and thus the EPR qubits
can be used to properly introspect the original game.

Recall that the players in the introspection game are referred to as “introspecting players” and the players
in the original game are referred to as “original players”. We use the following notation in order to distinguish
between questions and answers meant for the introspecting players versus the original players. The
questions and answers of the introspecting players are denoted by hatted variables (e.g., $\hat{x}$ and $\hat{a}$). Similarly,
the associated question types are denoted using hatted variables $\hat{t}$. The questions and answers of the original
players in the original game $V_N$ are denoted using non-hatted variables (e.g. x, a, and so on).


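Types and type graph. The type set T^INTRO for the introspective verifier V̂^INTRO is

\[ \mathcal{T}^{\mathrm{INTRO}} = \mathcal{T}^{\mathrm{PAULI}} \cup \Big( \big( \{\mathrm{INTRO}, \mathrm{SAMPLE}, \mathrm{READ}\} \cup \bigcup_{k=1}^{\ell} \{\mathrm{HIDE}_k\} \big) \times \{\mathrm{A}, \mathrm{B}\} \Big), \]

where T^PAULI is the type set of the Pauli basis test, defined in Section 7.3. The type graph G^INTRO is specified
in Figure 9.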
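As a concrete illustration of the non-Pauli part of the type set, the following Python sketch enumerates ({INTRO, SAMPLE, READ} ∪ {HIDE_1, …, HIDE_ℓ}) × {A, B} for a given ℓ. The function name and the encoding of types as pairs of strings are assumptions made for illustration only; the Pauli basis test types T^PAULI of Section 7.3 are not modeled.

```python
from itertools import product


def non_pauli_types(num_levels):
    """Enumerate ({INTRO, SAMPLE, READ} ∪ {HIDE_1..HIDE_l}) x {A, B}.

    Illustrative encoding only: each type is a (label, player) pair of strings.
    """
    labels = ["INTRO", "SAMPLE", "READ"]
    labels += [f"HIDE_{k}" for k in range(1, num_levels + 1)]
    return list(product(labels, ["A", "B"]))


types = non_pauli_types(3)
print(len(types))   # 2 * (3 + 3) = 12 types for a 3-level sampler
print(types[:2])
```

The count 2(3 + ℓ) grows linearly in ℓ, so this part of the type set has size O(ℓ).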


[Figure 9 (diagram not reproduced). Vertex labels recoverable from the original figure: the Pauli basis test
types (POINT, W), (ALINE, W), (DLINE, W), (PAULI, W), (PAIR, W) for W ∈ {X, Z}, together with PAIR,
CONSTRAINT_i and VARIABLE_j; and the introspection types (INTRO, v), (SAMPLE, v), (READ, v), (HIDE_k, v)
for v ∈ {A, B}.]

Figure 9: Type graph G^INTRO for the introspection game. Each vertex also has a self-loop which is not drawn
in the figure for clarity.


Sampler. We first define a 3-level (T^INTRO, G^INTRO)-typed sampler S̃^INTRO, which has field size q(n)
and dimension 3m(n) + 3, where q(n) and m(n) are specified by introparams(R). Note that the
dimension matches that of the space V^PAULI of the CL functions of the Pauli basis test parameterized by
introparams(R), specified in Section 7.3.1.




Fix n ∈ N. We specify the CL functions of S̃^INTRO on index n. Since the functions L^A_t̂ and L^B_t̂ are
identical for all n and t̂, we omit the superscripts A and B. For types t̂ ∈ T^PAULI, the CL functions L_t̂ are
given by those specified in Section 7.3.1, parameterized by introparams(R). For types t̂ ∈ T^INTRO \ T^PAULI,
the associated CL functions are defined to be 0-level CL functions (i.e., they are the 0 map). This means that
for question types such as t̂ = (INTRO, v) or t̂ = (SAMPLE, v) for v ∈ {A,B} the associated question
consists solely of the question type label.

Finally, we define the typed sampler Ŝ^INTRO to be κ(S̃^INTRO), the downsized sampler (Definition 6.6)
corresponding to S̃^INTRO. The typed CL functions associated with the Pauli basis test are 3-level; using
Remark 4.4 and Lemma 4.11 it follows that Ŝ^INTRO is a 3-level typed sampler.

The following lemma establishes the complexity of the sampler Ŝ^INTRO as well as the complexity of
computing a description of it from the parameters (λ, ℓ).


Lemma 8.1. There exists a 2-input Turing machine ComputeIntroSampler that on input (λ, ℓ) outputs a
description of the sampler Ŝ^INTRO (corresponding to parameters λ, ℓ) in time polylog(λ, ℓ). Furthermore,


1. TIME_{Ŝ^INTRO}(n) = poly(n, λ, ℓ), and
2. Ŝ^INTRO is a 3-level typed sampler.


Proof. Define the following 9-input Turing machine RawIntroSampler, which does not depend on any
parameters (so that its description length is constant). On input (λ′, ℓ′, n′, x′_1, …, x′_6), the Turing machine
RawIntroSampler computes the output of the typed sampler S̃^INTRO (parameterized by (λ′, ℓ′)) with
input tapes set to (n′, x′_1, …, x′_6). In more detail, the Turing machine RawIntroSampler first computes
introparams(R) for R = 2^{λ′n′}. Using Lemma 7.18, this computation takes time polylog(R). Next,
depending on the contents of the last 7 input tapes of RawIntroSampler, the Turing machine evaluates the
dimension of Ŝ^INTRO (which can easily be computed from introparams(R)), or one of the CL functions,
or returns a representation of a factor space of S̃^INTRO. If the type passed as input is t̂ ∈ T^PAULI then by
Lemma 7.19 this takes time polylog(R). If t̂ ∈ T^INTRO \ T^PAULI then this can be done in O(log ℓ′) time (to
read the input type). Overall, RawIntroSampler runs in time polylog(R, ℓ′).

We now define the Turing machine ComputeIntroSampler as follows. On input (λ, ℓ), ComputeIntroSampler
outputs the description of RawIntroSampler with the first two input tapes hardwired to λ and ℓ,
respectively, yielding the sampler Ŝ^INTRO corresponding to parameters (λ, ℓ). Computing this description takes
O(log λ + log ℓ) time.

The time complexity of Ŝ^INTRO follows from the time complexity of RawIntroSampler, and the number
of levels holds by construction. □
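The hardwiring step in this proof can be pictured as partial application. The sketch below is only an analogy in Python, not a Turing machine construction: raw_intro_sampler is a hypothetical stand-in that computes R = 2^{λn} and echoes its remaining inputs, and compute_intro_sampler fixes its first two arguments.

```python
from functools import partial


def raw_intro_sampler(lam, num_levels, n, *tapes):
    """Hypothetical stand-in for RawIntroSampler: parameters are explicit inputs.

    It only computes R = 2^(lam * n) and echoes the remaining input tapes.
    """
    R = 2 ** (lam * n)
    return {"R": R, "levels": num_levels, "tapes": tapes}


def compute_intro_sampler(lam, num_levels):
    """Analogue of ComputeIntroSampler: hardwire the first two inputs."""
    return partial(raw_intro_sampler, lam, num_levels)


sampler = compute_intro_sampler(2, 3)      # parameters (lambda, l) = (2, 3)
print(sampler(4, "DIMENSION")["R"])        # 2^(2*4) = 256
```

Producing the hardwired object only requires writing down λ and ℓ next to a fixed program, which mirrors the O(log λ + log ℓ) bound in the proof.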


Decider. The typed decider D̂^INTRO is specified in Figure 10. We explain how to interpret the figure,
including the notation. (It may also be helpful to review the description of the intended strategy for the 
players in the game, described in Section 8.3.) 


Question and answer format. The decider takes as input a tuple (n, t̂_A, x̂_A, t̂_B, x̂_B, â_A, â_B) where (t̂_w, x̂_w)
denotes the question for introspecting player w ∈ {A,B}, and â_w denotes their answer. As in the
specification of the Pauli basis test, in Figure 10 we include an “answer key” indicating how the players’ answers
are parsed, depending on the question type. When the question type is from T^PAULI, the question and
answer format are as described in Figure 8. When the question type is in T^INTRO \ T^PAULI, the answer format
is described in the table at the top of Figure 10.^30 For each such question, there is an associated variable


30 There is no question format specification for the question types in T^INTRO \ T^PAULI, because the question
consists solely of the question type label.




Type            Answer format

(INTRO, v)      (y, a) ∈ V × {0,1}*
(SAMPLE, v)     (z, a) ∈ V × {0,1}*
(READ, v)       (y, y^⊥, a) ∈ V × V × {0,1}*
(HIDE_k, v)     (y, y^⊥, x) ∈ V × V × V


In the following, whenever S or D is called as a subroutine, D̂^INTRO aborts and rejects if the
subroutine takes more than N^λ time steps. On input (n, t̂_A, x̂_A, t̂_B, x̂_B, â_A, â_B), the decider
D̂^INTRO first computes the dimension s(N) of V by calling the original sampler S on input
(N, DIMENSION). The decider rejects if s(N) > N^λ, or if max{|â_A|, |â_B|} > 3·2^m·log q
where introparams(N^λ) = (q, m, d). Otherwise, it performs the following tests for all
w, v ∈ {A,B}. (If no test applies, the decider accepts.)


1. (Pauli test). If t̂_A, t̂_B ∈ T^PAULI, the decider accepts if D^PAULI accepts the input
   (n, t̂_A, x̂_A, t̂_B, x̂_B, â_A, â_B).
2. (Sampling tests).
   (a) If t̂_w = (PAULI, Z) and t̂_w̄ = (SAMPLE, v), accept if (â_w)_V = z_w̄.
   (b) If t̂_w = (INTRO, v), t̂_w̄ = (SAMPLE, v), accept if y_w = L^v(z_w̄) and a_w = a_w̄.
3. (Hiding tests).
   (a) If t̂_w = (INTRO, v), t̂_w̄ = (READ, v), accept if y_w = y_w̄ and a_w = a_w̄.
   (b) If t̂_w = (HIDE_ℓ, v), t̂_w̄ = (READ, v), accept if y_{w,<ℓ} = y_{w̄,<ℓ} and y^⊥_w = y^⊥_w̄.
   (c) If t̂_w = (HIDE_k, v), t̂_w̄ = (HIDE_{k+1}, v) for some k ∈ {1, 2, …, ℓ − 1}, accept if

       y_{w,<k} = y_{w̄,<k},   y^⊥_{w,≤k} = y^⊥_{w̄,≤k},   x_{w,>k+1} = x_{w̄,>k+1},

       and y^⊥_{w̄,k+1} = (L^v_{k+1,u})^⊥(x_{w,k+1}) where u = y_{w̄,≤k}.
   (d) If t̂_w = (PAULI, X), t̂_w̄ = (HIDE_1, v), accept if y^⊥_{w̄,1} = (L^v_1)^⊥((â_w)_{V_1}) and
       (â_w)_{V_{>1}} = x_{w̄,>1}.
4. (Game test). If t̂_w = (INTRO, A) and t̂_w̄ = (INTRO, B), accept if D accepts
   (N, y_w, y_w̄, a_w, a_w̄).
5. (Consistency test). If t̂_A = t̂_B, accept if and only if â_A = â_B.


Figure 10: The typed decider D̂^INTRO for the introspective verifier, parameterized by integers λ, ℓ, on index
n. N denotes 2^n, V is the ambient space for S, and {L^v : v ∈ {A,B}} the associated CL functions on index N.
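The case analysis of Figure 10 can be read as a dispatch on the ordered pair of question types. The following Python sketch lists which numbered tests are triggered for a given pair; the function name and the tuple encoding of types (for example ("HIDE", k, v)) are hypothetical, the set of Pauli types passed in is a placeholder, and none of the checks themselves are implemented.

```python
def triggered_tests(t_w, t_wbar, num_levels, pauli_types):
    """List the Figure 10 tests triggered when player w receives type t_w and
    the other player receives t_wbar.

    Hypothetical encoding: non-Pauli types are tuples ("INTRO", v),
    ("SAMPLE", v), ("READ", v) or ("HIDE", k, v) with v in {"A", "B"}.
    """
    tests = []
    if t_w in pauli_types and t_wbar in pauli_types:
        tests.append("1 (Pauli)")
    if t_w == ("PAULI", "Z") and t_wbar[0] == "SAMPLE":
        tests.append("2a")
    if t_w[0] == "INTRO" and t_wbar == ("SAMPLE", t_w[1]):
        tests.append("2b")
    if t_w[0] == "INTRO" and t_wbar == ("READ", t_w[1]):
        tests.append("3a")
    if t_w[0] == "HIDE" and t_w[1] == num_levels and t_wbar == ("READ", t_w[2]):
        tests.append("3b")
    if t_w[0] == "HIDE" and t_wbar == ("HIDE", t_w[1] + 1, t_w[2]):
        tests.append("3c")
    if t_w == ("PAULI", "X") and t_wbar[:2] == ("HIDE", 1):
        tests.append("3d")
    if t_w == ("INTRO", "A") and t_wbar == ("INTRO", "B"):
        tests.append("4 (game)")
    if t_w == t_wbar:
        tests.append("5 (consistency)")
    return tests


# Placeholder: only two of the Pauli basis test types are listed here.
pauli = {("PAULI", "X"), ("PAULI", "Z")}
print(triggered_tests(("INTRO", "A"), ("INTRO", "B"), 3, pauli))      # ['4 (game)']
print(triggered_tests(("HIDE", 2, "A"), ("HIDE", 3, "A"), 3, pauli))  # ['3c']
print(triggered_tests(("HIDE", 3, "B"), ("READ", "B"), 3, pauli))     # ['3b']
```

Since the decider performs the tests for all w ∈ {A,B}, a full dispatch would call this function on both orderings of the two players.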




v ∈ {A,B} that indicates to the introspecting player which original player it is supposed to impersonate in
the introspection game.

In the figure, V denotes the ambient space of the original sampler S on index N = 2^n. We assume
that the original verifier V is λ-bounded and in particular the running time of the original sampler S (and
therefore its dimension) is at most R = N^λ. If the dimension were larger than N^λ, the decider D̂^INTRO
would always reject in the beginning.

Since V is isomorphic to F_2^{s(N)}, where s(N) is the dimension of S, the space V is identified in a
canonical way as the register subspace of F_2^Q spanned by e_1, …, e_{s(N)}, where e_i is the i-th elementary basis
vector. For example, if t̂_w = (READ, v), then syntactically the player’s answer is a triple (y, y^⊥, a) in
F_2^Q × F_2^Q × {0,1}*. We assume that the decider D̂^INTRO computes the dimension s(N) of the subspace
V by calling S on input (N, DIMENSION), and if y, y^⊥ are not presented as vectors in the subspace V,
then the decider rejects. Thus in the analysis we directly consider y, y^⊥ as vectors in V. In Figure 10, the
components of the players’ answers are subscripted by the player index. For example, if player w receives
question (HIDE_k, v), then their answers are denoted by (y_w, y^⊥_w, x_w).

The notation used in the “answer key” is meant to give an indication of the intended meaning of the
players’ answers. We use y to denote a vector that is supposed to be the result of measuring a CL function
L; y^⊥ is supposed to be the result of measuring “dual” linear maps L^⊥ (as used in Step 3 in Figure 10); x is
supposed to be the result of σ^X measurements, and z is supposed to be the result of σ^Z measurements. We
use a to denote the introspected answers meant for the original decider D.


CL functions and factor spaces. For v ∈ {A,B}, let L^v = L^{v,N} denote the CL function for original
player v specified by S on index N = 2^n. For z ∈ V, the decider D̂^INTRO computes L^v(z) by calling S on
input (N, v, MARGINAL, ℓ, z).

For y ∈ V and k ∈ {1, …, ℓ} we define register subspaces V^v_k(y) by induction on k. For k = 1, V^v_1(y)
is the first factor space^31 of L^v and is independent of y. Suppose V^v_j(y) has been defined for all j < k. Then
we define the marginal space V^v_{<k}(y) = ⊕_{j=1}^{k−1} V^v_j(y), and define V^v_k(y) as the k-th factor space V^v_{k,u} of L^v
with prefix u = L^v_{<k}(y). We also define V^v_{≤k}(y) = V^v_{<k+1}(y), and V^v_{>k}(y) to be the complementary register
subspace to V^v_{≤k}(y) within V.

The decider D̂^INTRO computes factor spaces V^v_j(y) from y ∈ V in the following sequential manner:
first, the indicator vector for the factor space V^v_1 is computed by calling the original sampler S on input
(N, v, FACTOR, 1, 0). Let y_1 denote L^v_1(y). Then, for j ∈ {2, …, ℓ}, the factor space V^v_j(y) is computed
by calling S on input (N, v, FACTOR, j, y_{≤j−1}), where y_{≤j−1} = L^v_{≤j−1}(y).
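The sequential procedure can be sketched as a loop in which each call to the sampler uses the prefix determined by the factor spaces found so far. The Python sketch below uses a toy 2-level CL function on F_2^4; toy_factor_space is a hypothetical stand-in for calling S on (N, v, FACTOR, j, prefix), and the simplifying assumption that each L_j acts as the identity on its factor space is made only for this illustration.

```python
def toy_factor_space(j, prefix):
    """Hypothetical stand-in for calling S on (N, v, FACTOR, j, prefix).

    Returns the set of coordinates spanning the j-th factor space of a toy
    2-level CL function on F_2^4. The second factor space depends on the prefix.
    """
    if j == 1:
        return {0, 1}
    if j == 2:
        return {2} if prefix[0] == 0 else {3}
    raise ValueError("toy sampler has only 2 levels")


def factor_spaces(y, num_levels):
    """Compute the factor spaces V_1, V_2(y), ... sequentially from y.

    Toy assumption: each L_j is the identity on its factor space, so the
    prefix y_{<=j-1} is y restricted to the coordinates found so far.
    """
    spaces = []
    seen = set()
    for j in range(1, num_levels + 1):
        prefix = [y[i] if i in seen else 0 for i in range(len(y))]
        space = toy_factor_space(j, prefix)
        spaces.append(space)
        seen |= space
    return spaces


print(factor_spaces([0, 1, 1, 1], 2))   # [{0, 1}, {2}]
print(factor_spaces([1, 0, 1, 1], 2))   # [{0, 1}, {3}]
```

The two calls differ only in the first coordinate of y, which is enough to change the second factor space; this is the prefix dependence that the decider has to track.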

Finally, Lemma 4.6 implies that the CL function L^v gives rise to functions {L^v_{k,u}} where for all k and
prefixes u, L^v_{k,u} is a map from V^v_{k,u} to V^v_{k,u}.


Detailed explanation of the steps of D̂^INTRO. We give more details on the implementation of decider
D̂^INTRO specified in Figure 10.


1. The decider D̂^INTRO first checks that the answers are not too long; the maximum-length answer
should be a tuple (y, y^⊥, a) in response to question (READ, v) where y, y^⊥ ∈ V and a is an answer
intended for the original decider D on index N (which we assume runs in time at most N^λ), or a tuple
(y, y^⊥, x) ∈ V × V × V in response to question (HIDE_k, v). Either way, the maximum answer length
should be 3Q = 3·2^m·log q = poly(R) bits long. This check is necessary in order to ensure that
the decider D̂^INTRO halts in time that is a fixed polynomial of R = N^λ.


31 See Lemma 4.6 for the definitions of factor spaces of a CL function.




2. If the question types for the players are both in T^PAULI, D̂^INTRO executes the decision procedure D^PAULI
for the Pauli basis test parametrized by introparams(R) = (q, m, d) (see Section 7.3.1 for the
definition of D^PAULI). The Pauli basis test is intended to ensure that the players share at least Q ≥ R
EPR pairs, where Q = 2^m log q. This number of EPR pairs is chosen so that, under the assumption
that the original verifier V is λ-bounded (see Definition 5.32), the time complexity of S on index N
(and therefore its dimension) is at most R = N^λ. When analyzing the Completeness, Soundness, and
Entanglement properties of the introspective verifier V^INTRO, we assume that the original verifier V is
λ-bounded (see Theorem 8.3).


3. In Step 2a of D̂^INTRO, player w ∈ {A,B} receives question (PAULI, Z) and player w̄ receives question
(SAMPLE, v), for some v ∈ {A,B}. According to the answer key, player w is expected to return an
answer â_w ∈ F_2^Q and player w̄ is expected to respond with a pair (z_w̄, a_w̄) ∈ V × {0,1}*. Thus,
the dimension of answer â_w may be larger than that of z_w̄; this is the reason that Step 2a checks
consistency between z_w̄ and the projection of â_w to V.


4. In Step 2b of D̂^INTRO, player w ∈ {A,B} receives question (INTRO, v) and player w̄ receives question
(SAMPLE, v). As specified by the “answer key”, player w responds with (y_w, a_w) ∈ V × {0,1}* and
player w̄ responds with (z_w̄, a_w̄) ∈ V × {0,1}*. The decider D̂^INTRO checks that a_w = a_w̄ and
y_w = L^v(z_w̄) where L^v denotes the CL function for player v specified by S for index N = 2^n. The
CL function is computed by calling S on input (N, v, MARGINAL, ℓ, z).


5. In Step 3b, D̂^INTRO checks that the answer of player w who received (HIDE_ℓ, v) is consistent with the
answer of player w̄ who received (READ, v). One of the checks is that y_{w,<ℓ} = y_{w̄,<ℓ}; these are,
respectively, L^v_{<ℓ}(y_w) and L^v_{<ℓ}(y_w̄).


6. In Step 3c, the vectors x_{w,>k+1} and x_{w̄,>k+1} denote the projections of x_w and x_w̄ to the factor spaces
V^v_{>k+1}(y_w) and V^v_{>k+1}(y_w̄), respectively. Similarly, y^⊥_{w̄,k+1} denotes the projection of y^⊥_w̄ to the factor
space V^v_{k+1}(y_w̄) and x_{w,k+1} denotes the projection of x_w to the factor space V^v_{k+1}(y_w̄). We emphasize
that the latter factor space depends on y_w̄ and not y_w.^32

The decider also has to compute (L^v_{k+1,y_{w̄,≤k}})^⊥(x_{w,k+1}). According to Definition 3.11, this requires
specifying a basis for ker(L^v_{k+1,y_{w̄,≤k}})^⊥. To compute the value, the decider performs the following
steps:

(a) Call S on input (N, v, FACTOR, k + 1, y_{w̄,≤k}) to obtain a subset H = {h_1, …, h_m} of the
canonical basis for F_2^{s(N)} that is a basis of the register subspace V^v_{k+1}(y_w̄).

(b) For i ∈ {1, 2, …, m} compute c_i = S(N, v, LINEAR, k + 1, y_{w̄,≤k}, h_i). Compute a matrix
representation M for L^v_{k+1,y_{w̄,≤k}} in the basis H, whose columns are the vectors c_1, …, c_m as
elements of V^v_{k+1}(y_w̄).

(c) Using a canonical algorithm for Gaussian elimination, compute a basis F for ker(M).

(d) Compute the canonical complement S of F, as in Definition 3.6. S is a basis for ker(L^v_{k+1,y_{w̄,≤k}})^⊥.

(e) To compute (L^v_{k+1,y_{w̄,≤k}})^⊥ on input x_{w,k+1}, compute the canonical linear map with kernel basis
S (see Definition 3.10) on input x_{w,k+1}.


32 This is because it is the w̄ player who receives the (HIDE_{k+1}, v) question, and in the “honest” strategy (described in Section 8.3)
they are supposed to measure k registers to obtain a sequence of values (y_{w̄,1}, y_{w̄,2}, …, y_{w̄,k}) which specify the (k + 1)-st factor
subspace, whereas the w player is only supposed to measure k − 1 registers.




7. In Step 3d, the player w that receives (PAULI, X) is expected to return an answer â_w in F_2^Q. Part of
this step checks that the projection of â_w to V_{>1} is equal to x_{w̄,>1} (which is the projection to V_{>1} of
the third component of the answer triple of player w̄ that receives question (HIDE_1, v)).
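The Gaussian elimination in step (c) of item 6 can be sketched as follows for matrices over F_2. This is a minimal illustration with hypothetical function names; it computes some basis of ker(M) and does not implement the canonical complement of Definition 3.6 or the canonical linear map of Definition 3.10, whose conventions are fixed elsewhere in the paper.

```python
def kernel_basis_gf2(matrix):
    """Return a basis of ker(M) over F_2 via Gaussian elimination.

    `matrix` is a list of rows, each a list of 0/1 entries.
    """
    rows = [row[:] for row in matrix]
    num_cols = len(rows[0])
    pivot_cols = []
    r = 0
    for c in range(num_cols):
        # Find a row at or below r with a 1 in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(rows):
            break
    free_cols = [c for c in range(num_cols) if c not in pivot_cols]
    basis = []
    for f in free_cols:
        vec = [0] * num_cols
        vec[f] = 1
        # Each pivot variable is determined by the free variable in its row.
        for row_idx, p in enumerate(pivot_cols):
            vec[p] = rows[row_idx][f]
        basis.append(vec)
    return basis


def apply_gf2(matrix, vec):
    """Matrix-vector product over F_2."""
    return [sum(a & b for a, b in zip(row, vec)) % 2 for row in matrix]


M = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]          # rank 2 over F_2 (third row = sum of the first two)
F = kernel_basis_gf2(M)
print(F)                 # [[1, 1, 1]]
```

The running time is polynomial in the matrix dimensions, in line with the poly(s(N′)) bound used later for this step.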


8.2.1 Complexity of the introspective verifier 


In this section we determine the complexity of the introspective verifier. The following lemma establishes 
the complexity of the decider D̂^INTRO as well as the complexity of computing a description of it from the
parameters (λ, ℓ) and the description of the original verifier V = (S, D).


Lemma 8.2. There exists a polynomial time Turing machine ComputeIntroDecider that on input (V, λ, ℓ)
outputs a description of a Turing machine where:


1. Its description length is at most poly(λ, ℓ).
2. Its time complexity is at most TIME_{D̂^INTRO}(n) = poly(2^{λn}, ℓ).


Furthermore, if |V| ≤ λ, then the output of ComputeIntroDecider is the introspective decider D̂^INTRO
corresponding to verifier V and parameters (λ, ℓ).


Proof. Define the following 10-input Turing machine RawIntroDecider, which is a universal Turing
machine that computes the same function as the introspection decider D̂^INTRO, except that it also takes
extra inputs to specify the sampler, decider, and the parameters λ, ℓ that are used in D̂^INTRO. Since
RawIntroDecider is a universal Turing machine, its description length is constant. More formally, on input

(V′, λ′, ℓ′, n′, x′_1, x′_2, …, x′_6),

the Turing machine RawIntroDecider computes the output of the decider D̂^INTRO corresponding to V′, λ′, ℓ′
with input tapes set to (n′, x′_1, …, x′_6). In more detail, the Turing machine D̂^INTRO first computes introparams(R)
for R = 2^{λ′n′}. Using Lemma 7.18, this computation takes time polylog(R). It then executes the tests
described in Figure 10. Write V′ = (S′, D′). The complexity of performing the entire procedure is subsumed
by the complexity of the following steps:


1. Running the decider D^PAULI, which takes time poly(R) by Lemma 7.19;
2. Running the decider D′ (on index N′ = 2^{n′}) for at most 2^{λ′n′} steps;
3. Running the sampler S′ (on index N′) in order to compute the dimension s(N′) and the marginal and
factor spaces, and the CL functions as described in Section 8.2. S′ is called at most poly(s(N′), ℓ′)
times; due to the timeout, each call takes time at most 2^{λ′n′};
4. Computing (L^v_{k,u})^⊥(x_{w,k}) in Step 3c. This only requires performing Gaussian elimination and other
simple finite field manipulations that can be implemented in poly(s(N′)) time.


All other tests are elementary. Thus the time complexity of RawIntroDecider is poly(R, ℓ′). Note that
the bound is independent of V′: due to the abort condition in the definition of D̂^INTRO, the Turing machine
D̂^INTRO aborts if the runtime of S′ or D′ is larger than 2^{λ′n′}.
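The abort condition is what makes the running time bound independent of V′. As a rough illustration, the following Python sketch models a subroutine as a generator that yields once per simulated step and runs it under a step budget of 2^{λn}. The generator-based machine model and all names are assumptions of this sketch, not the paper's formalism.

```python
def run_with_budget(machine, budget):
    """Run a step-by-step computation for at most `budget` steps.

    `machine` is a generator that yields once per simulated step and returns
    its output when it halts. Returns (True, output) if it halts in time and
    (False, None) otherwise, in which case the caller should reject.
    """
    steps = 0
    while True:
        try:
            next(machine)
        except StopIteration as halt:
            return True, halt.value
        steps += 1
        if steps > budget:
            return False, None


def halts_after(k):
    """Toy machine that takes k steps and then accepts."""
    for _ in range(k):
        yield
    return "accept"


def never_halts():
    """Toy machine that runs forever."""
    while True:
        yield


lam, n = 2, 3
budget = 2 ** (lam * n)                            # 2^(lambda * n) = 64 steps
print(run_with_budget(halts_after(10), budget))   # (True, 'accept')
print(run_with_budget(never_halts(), budget))     # (False, None)
```

Because the wrapper always returns after at most budget + 1 simulated steps, the caller halts even when the simulated machine does not.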

We now define the Turing machine ComputeIntroDecider: on input (V, λ, ℓ), it first determines whether
the description length |V| of the verifier is at most λ.^33 If it isn’t, then it sets V′ = (S′, D′) where S′, D′


33 This can be done by keeping a counter of O(log λ) bits, and incrementing it as the input V is being scanned through. If at any
point the counter exceeds λ, the scanning stops.




are the trivial Turing machine whose description is 0. Otherwise, ComputeIntroDecider sets V′ = V. Note
that by construction, the description length of V′ is at most λ, and that the time complexity of computing
the description of V′ is polynomial in |V| and log λ.

Then ComputeIntroDecider outputs the description of RawIntroDecider with the first three input tapes
hardwired to V′, λ, and ℓ, respectively, yielding the decider D̂^INTRO corresponding to the verifier V′ and
parameters (λ, ℓ). Computing this description takes poly(|V|, log λ, log ℓ) time. Thus overall the complexity
of ComputeIntroDecider is poly(|V|, log λ, log ℓ), which is polynomial in its input length.

The description length of D̂^INTRO is simply the description length of RawIntroDecider (which is a
universal constant), plus the description length of V′ (which is at most λ), plus the description lengths of λ
and ℓ, which altogether totals poly(λ, ℓ) as desired. The time complexity of D̂^INTRO follows from the time
complexity of RawIntroDecider, which we established is poly(R, ℓ) = poly(2^{λn}, ℓ).

Finally, we note that if |V| ≤ λ, then by construction the output of ComputeIntroDecider is the decider
D̂^INTRO corresponding to the verifier V and parameters (λ, ℓ). □


8.2.2 Introspection theorem 


We are now ready to state the introspection theorem. 


Theorem 8.3 (Introspection theorem). There is a polynomial time Turing machine ComputeIntroVerifier
which takes as input a tuple (V, λ, ℓ) for λ, ℓ ∈ N and returns the description of the introspective verifier
V^INTRO. For all ℓ ∈ N, V^INTRO is a 5-level normal-form verifier with complexity measures


TIME_{S^INTRO}(n) = poly(n, λ, ℓ),
TIME_{D^INTRO}(n) = poly(2^{λn}, ℓ),
|D^INTRO| = poly(λ, ℓ).


Moreover, for all ℓ, there exist constants a > 0, 0 < b < 1 (depending on ℓ), and a function δ(ε, n) =
a((λn)^a ε^b + (λn)^{−b}) such that for all n ∈ N and ε > 0 the following statements hold if V is a λ-bounded
ℓ-level verifier.


1. (Completeness) If V_N has a projective, consistent, and commuting (PCC) strategy with value 1 then
V^INTRO_n also has a PCC strategy with value 1.


2. (Soundness) If val*(V^INTRO_n) > 1 − ε, then val*(V_N) ≥ 1 − δ(ε, n).


3. (Entanglement) The entanglement bound ℰ defined in Definition 5.12 satisfies

\[ \mathcal{E}(V^{\mathrm{INTRO}}_n, 1 - \varepsilon) \ge \max\left\{ \mathcal{E}(V_N, 1 - \delta(\varepsilon, n)),\ (1 - \delta(\varepsilon, n))\, 2^{2^{\lambda n}} \right\}. \]


Proof. The Turing machine ComputeIntroVerifier does the following on input (V, λ, ℓ): it first computes

Ŝ^INTRO = ComputeIntroSampler(λ, ℓ),
D̂^INTRO = ComputeIntroDecider(V, λ, ℓ),

using Lemmas 8.1 and 8.2, runs the detyping procedure from Definition 6.17 to compute

V^INTRO = Detype(V̂^INTRO)

where V̂^INTRO = (Ŝ^INTRO, D̂^INTRO), and then outputs the resulting detyped verifier. This takes time
poly(|V|, log λ, log ℓ), which is polynomial in the input length of ComputeIntroVerifier. The complexity
parameters of the typed sampler Ŝ^INTRO and typed decider D̂^INTRO follow from Lemmas 8.1 and 8.2. The
complexity parameters of the detyped verifier V^INTRO then follow from Lemma 6.18. As V̂^INTRO is a 3-level
verifier, V^INTRO is 5-level, since detyping increases the number of levels by at most 2.
The completeness part of the theorem is proven in Section 8.3 and the soundness and entanglement
properties are proven in Section 8.4. □


8.3 Completeness of the introspective verifier 


In this section we establish the completeness property of the introspection game: if for N = 2^n, V_N has
a PCC strategy (see Definition 5.11) with value 1, then so does V^INTRO_n. For this we describe the actions
that are expected of the players in the introspection game (i.e. the “honest strategy”). We first prove several 
preliminary lemmas that will be used in both the completeness and soundness analysis. 


8.3.1 Preliminary lemmas 


The lemmas in this section are stated for general fields F and generalized Pauli observables and projectors,
although in the application to introspection games we use F = F_2, ω = −1, and qubit Pauli observables
and projectors. Furthermore, the Pauli observables τ^W(v) and projectors τ^W_u act on C^{F^k} for some integer k
(in our application, we set k = R).

The following lemma generalizes Eq. (21). 


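Lemma 8.4. Let L : F^k → F^k be a linear map, and let W ∈ {X, Z}. For each a in the range of L, let
u_a ∈ F^k be such that L(u_a) = a.


1. For each v ∈ ker(L)^⊥,

\[ \tau^W(v) = \sum_{a \in \mathbb{F}^k} \omega^{\mathrm{tr}(v \cdot u_a)} \, \tau^W_{[L(\cdot) = a]} . \]

2. For all a in the range of L,

\[ \tau^W_{[L(\cdot) = a]} = \mathop{\mathbb{E}}_{v \sim \ker(L)^\perp} \omega^{-\mathrm{tr}(v \cdot u_a)} \, \tau^W(v) . \]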

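Proof. Let V denote the image of F^k under L. Let a ∈ V, v ∈ ker(L)^⊥. For all u, u′ ∈ L^{−1}(a), we have
that u − u′ ∈ ker(L) and thus tr(u · v) = tr(u′ · v). As a result, using (20), for any v ∈ ker(L)^⊥ it holds
that

\[ \tau^W(v) = \sum_{u \in \mathbb{F}^k} \omega^{\mathrm{tr}(u \cdot v)} \tau^W_u = \sum_{a \in V} \sum_{u \in L^{-1}(a)} \omega^{\mathrm{tr}(u \cdot v)} \tau^W_u = \sum_{a \in V} \omega^{\mathrm{tr}(u_a \cdot v)} \tau^W_{[L(\cdot) = a]} = \sum_{a \in \mathbb{F}^k} \omega^{\mathrm{tr}(u_a \cdot v)} \tau^W_{[L(\cdot) = a]} , \]

where in the last equality we used that for a ∉ V, the projector τ^W_{[L(·)=a]} vanishes. This shows the first item.

For the second item,

\[ \begin{aligned}
\tau^W_{[L(\cdot) = a]} &= \sum_{u \in \ker(L)} \tau^W_{u_a + u} \\
&= \sum_{u \in \ker(L)} \mathop{\mathbb{E}}_{v \sim \mathbb{F}^k} \left( \omega^{-\mathrm{tr}((u_a + u) \cdot v)} \, \tau^W(v) \right) \\
&= \frac{|\ker(L)|}{|\mathbb{F}^k|} \sum_{v \in \mathbb{F}^k} \left( \mathop{\mathbb{E}}_{u \sim \ker(L)} \omega^{-\mathrm{tr}(u \cdot v)} \right) \omega^{-\mathrm{tr}(u_a \cdot v)} \, \tau^W(v) \\
&= \mathop{\mathbb{E}}_{v \sim \ker(L)^\perp} \omega^{-\mathrm{tr}(u_a \cdot v)} \, \tau^W(v) ,
\end{aligned} \]

where the second equality follows from (21) and the last uses the fact that |F^k| = |ker(L)| |ker(L)^⊥|, as
shown in Lemma 3.4, and Lemma 3.25 applied to the expectation over u. □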
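For qubits (F = F_2, ω = −1) and W = Z, both sides of the second item are diagonal in the computational basis, so the identity can be checked entry by entry. The Python sketch below does this for a toy linear map on F_2^3; the map L and all function names are assumptions chosen for illustration.

```python
from itertools import product

k = 3


def dot(u, v):
    """Inner product over F_2."""
    return sum(a & b for a, b in zip(u, v)) % 2


def L(u):
    # Toy linear map on F_2^3 (an assumption for illustration):
    # L(u1, u2, u3) = (u1 + u2, u3, 0).
    return ((u[0] + u[1]) % 2, u[2], 0)


vectors = list(product([0, 1], repeat=k))
kernel = [u for u in vectors if L(u) == (0, 0, 0)]
kernel_perp = [v for v in vectors if all(dot(v, u) == 0 for u in kernel)]


def check_item_2(a, u_a):
    """Compare the diagonals of both sides of item 2 for W = Z, omega = -1."""
    for u in vectors:
        lhs = 1 if L(u) == a else 0                     # entry of the projector
        total = sum((-1) ** dot(v, u_a) * (-1) ** dot(v, u) for v in kernel_perp)
        if lhs * len(kernel_perp) != total:             # rhs = total / |ker(L)^perp|
            return False
    return True


# For every a in the range of L, pick some preimage u_a and check the identity.
preimage = {}
for u in vectors:
    preimage.setdefault(L(u), u)
print(len(kernel), len(kernel_perp))                    # 2 4
print(all(check_item_2(a, u_a) for a, u_a in preimage.items()))   # True
```

The sizes printed first also illustrate |F^k| = |ker(L)| · |ker(L)^⊥|, here 8 = 2 · 4.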


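Lemma 8.5 (Commuting X and Z measurements). For all linear maps L, R : F^k → F^k such that

\[ \ker(R)^\perp \subseteq \ker(L) , \]

the measurements {τ^Z_{[L(·)=b]}}_{b ∈ F^k} and {τ^X_{[R(·)=d]}}_{d ∈ F^k} commute.


Proof. Let b, d ∈ F^k. If either b is not in the range of L, or d is not in the range of R, then at least one of
τ^Z_{[L(·)=b]} or τ^X_{[R(·)=d]} is 0, and thus the operators trivially commute. Otherwise, both b and d are in the range
of L and R, respectively. Let a_0 ∈ L^{−1}(b), c_0 ∈ R^{−1}(d). By Lemma 8.4,

\[ \tau^Z_{[L(\cdot) = b]} = \mathop{\mathbb{E}}_{u \sim \ker(L)^\perp} \omega^{-\mathrm{tr}(u \cdot a_0)} \, \tau^Z(u) , \qquad \tau^X_{[R(\cdot) = d]} = \mathop{\mathbb{E}}_{v \sim \ker(R)^\perp} \omega^{-\mathrm{tr}(v \cdot c_0)} \, \tau^X(v) . \]

For any v ∈ ker(R)^⊥, by assumption v ∈ ker(L), so for u ∈ ker(L)^⊥ it holds that u · v = 0. Thus τ^Z(u)
and τ^X(v) commute, and it follows that τ^Z_{[L(·)=b]} and τ^X_{[R(·)=d]} commute as well. □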
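A small numerical check of the lemma for two qubits can be done with explicit integer matrices. In the Python sketch below, the Z-basis projectors are diagonal and the X-basis projectors are obtained by conjugating with an unnormalized Hadamard transform, so they are scaled by 2^k, which does not affect commutation. The two toy linear maps and all function names are assumptions for illustration.

```python
from itertools import product

k = 2
vectors = list(product([0, 1], repeat=k))


def dot(u, v):
    """Inner product over F_2."""
    return sum(a & b for a, b in zip(u, v)) % 2


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def z_projector(L, b):
    """Diagonal projector onto computational basis states u with L(u) = b."""
    return [[1 if (u == w and L(u) == b) else 0 for w in vectors] for u in vectors]


# Unnormalized k-qubit Hadamard transform; H * H = 2^k * identity.
H = [[(-1) ** dot(u, w) for w in vectors] for u in vectors]


def x_projector_scaled(R, d):
    """2^k times the projector onto X-basis outcomes u with R(u) = d."""
    return matmul(matmul(H, z_projector(R, d)), H)


def measurements_commute(L, R):
    """Check that every Z-projector for L commutes with every X-projector for R."""
    for b in vectors:
        for d in vectors:
            P, Q = z_projector(L, b), x_projector_scaled(R, d)
            if matmul(P, Q) != matmul(Q, P):
                return False
    return True


def first(u):
    return (u[0], 0)     # kernel spanned by e_2


def second(u):
    return (u[1], 0)     # kernel spanned by e_1, so ker^perp = span(e_2)


print(measurements_commute(first, second))   # True: hypothesis of Lemma 8.5 holds
print(measurements_commute(first, first))    # False: Z and X on the same qubit
```

In the first call ker(R)^⊥ = span(e_2) is contained in ker(L) = span(e_2), so the hypothesis holds; in the second call ker(R)^⊥ = span(e_1) is not contained in ker(L), and the measurements fail to commute.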


8.3.2 Completeness of the introspective verifier 
Proof of the completeness part of Theorem 8.3. We analyze the completeness of the typed verifier V̂^INTRO;
the corresponding completeness of the detyped verifier V^INTRO = Detype(V̂^INTRO) follows from Lemma 6.18,
and the fact that the type set T^INTRO has size O(ℓ).


Completeness. We first show completeness. Let n ≥ 1 be an index for V̂^INTRO and N = 2^n be the
corresponding index for V. Since we assume that the verifier V is λ-bounded, we have that

\[ \max \{ \mathrm{TIME}_S(N), \mathrm{TIME}_D(N) \} \le N^\lambda = R . \tag{62} \]

This assumption on the time complexity of V ensures that D̂^INTRO never aborts due to a timeout. Let L^v =
L^{v,N} denote the CL function of the original sampler S on index N corresponding to player v ∈ {A,B}. Let
R = N^λ, and let introparams(R) = (q, m, d), as in Section 8.2. Set Q = M log q where M = 2^m; this
represents the number of qubits that are certified by the Pauli basis test parameterized by introparams(R).

Let 𝒮 = (|AUX⟩, A, B) be a PCC strategy for V_N with value 1. We first construct a PCC strategy
𝒮^INTRO for the typed verifier V̂^INTRO_n with value 1. We then conclude using Lemma 6.18.

                 V^v_1                      V^v_2                      V^v_3                      AUX

(INTRO, v)       σ^Z_{L_1}                  σ^Z_{L_2}                  σ^Z_{L_3}                  A^y/B^y
(SAMPLE, v)      σ^Z                        σ^Z                        σ^Z                        A^y/B^y
(READ, v)        σ^Z_{L_1}, σ^X_{L_1^⊥}     σ^Z_{L_2}, σ^X_{L_2^⊥}     σ^Z_{L_3}, σ^X_{L_3^⊥}     A^y/B^y
(HIDE_3, v)      σ^Z_{L_1}, σ^X_{L_1^⊥}     σ^Z_{L_2}, σ^X_{L_2^⊥}     σ^X_{L_3^⊥}                I
(HIDE_2, v)      σ^Z_{L_1}, σ^X_{L_1^⊥}     σ^X_{L_2^⊥}                σ^X                        I
(HIDE_1, v)      σ^X_{L_1^⊥}                σ^X                        σ^X                        I


The left-most column denotes the introspection/hiding questions that a player may receive. The
top row denotes the registers corresponding to the factor spaces of a CL function L^v (we note that
the partition of the registers depends on the prefixes), as well as the register corresponding to the
state |AUX⟩ coming from the original PCC strategy 𝒮. We use σ^Z_{L_j} as shorthand for σ^Z_{[L^v_{j,y_{<j}}(·) = y_j]}
and similarly σ^X_{L_j^⊥} for σ^X_{[(L^v_{j,y_{<j}})^⊥(·) = y_j^⊥]}. A symbol I means that the register is left unmeasured.


Figure 11: Summary of the honest strategy 𝒮^INTRO for V̂^INTRO for a 3-level sampler.


Remark 8.6. Note that by definition in V̂^INTRO the players receive questions (x, y) that are sampled
according to the distribution μ_{κ(S̃^INTRO),n} associated with the downsized typed sampler κ(S̃^INTRO). Using the
definition of the downsized typed sampler, Definition 6.6, and Lemma 4.12 the distribution is identical to the
distribution μ_{S̃^INTRO,n} up to the bijective mapping κ. This mapping can be computed by the players
themselves. Therefore, we construct a strategy for players that receive questions from μ_{S̃^INTRO,n}, and a strategy
for questions from μ_{Ŝ^INTRO,n} follows immediately.


The strategy 𝒮^INTRO uses the state |EPR_2⟩^{⊗(Q+2)} ⊗ |AUX⟩, where recall that |EPR_2⟩ = (1/√2)(|00⟩ +
|11⟩). For all register subspaces K ⊆ F_2^Q (see Definition 4.1 for the definition of register subspace),
whenever we refer to “register K”, we mean the qubits of |EPR_2⟩^{⊗Q} corresponding to K (see Section 3.5). The
most frequent register subspace we consider is V, spanned by e_1, …, e_{s(N)}. We write V̄ for the complement
of V, i.e. the register subspace spanned by e_{s(N)+1}, …, e_Q. (Note that s(N) ≤ R by assumption (62), and
by Lemma 7.17 we have R ≤ M ≤ Q.) Then |EPR_2⟩^{⊗Q} = |EPR⟩_V ⊗ |EPR⟩_V̄.

Let 𝒮^PAULI be the honest Pauli strategy with respect to introparams defined in the proof of Lemma 7.13.
Using the isomorphism between C^q and (C^2)^{⊗ log q} specified by Lemma 3.26, the measurements of 𝒮^PAULI
can be treated as qubit Pauli measurements acting on the maximally entangled state over qubits. Furthermore
the questions and answer labels of the measurements can be viewed as binary strings using the bijection
κ : F_q → F_2^{log q} specified in Section 3.3. For a question with a type from T^PAULI the player measures
the shared state |EPR_2⟩^{⊗(Q+2)} using the measurements specified by 𝒮^PAULI, and reports the measurement
outcomes. When a player receives questions of type t̂ ∈ T^INTRO \ T^PAULI, they perform measurements
described as follows. (Below, whenever we write a Pauli operator σ^W_a the register on which the operator
acts should always be clear from context, and is implicit from the space in which the outcome a ranges.)
The reader may find it helpful to consult Figure 11 to see a summary of the honest strategy 𝒮^INTRO for the
special case when ℓ = 3.


(INTRO, v): The player performs the measurement

\[ \left\{ \sigma^Z_{[L^v(\cdot) = y]} \right\}_{y} \tag{63} \]

to obtain a y ∈ V. Intuitively, the player has now introspectively sampled the question y for original
player v in game V_N. The player then measures |AUX⟩ using player A’s measurement {A^y_a} from 𝒮
if v = A and using player B’s measurement {B^y_a} if v = B to obtain an answer a. The player replies
with (y, a).


(SAMPLE, v): The player measures their share of |EPR⟩_V in the Z basis to obtain z ∈ V. Using this, they
compute the question y = L^v(z). The player then uses player v’s strategy and question y to measure
|AUX⟩ and obtain outcome a. The player replies with (z, a).


(READ, v): The player first performs all measurements as in the (INTRO, v) question for player v and
records the outcomes as y ∈ L^v(V) and a ∈ {0,1}*. For j ∈ {1, 2, …, ℓ}, the player measures their
share of the state |EPR⟩_V with the measurement

\[ \left\{ \sigma^X_{[L_j^\perp(\cdot) = y_j^\perp]} \right\} \tag{64} \]

to obtain y^⊥ = y^⊥_1 + y^⊥_2 + ⋯ + y^⊥_ℓ. Here L_j^⊥ is shorthand for the function (L^v_{j,y_{<j}})^⊥ defined in Item 6 of
the decider description in Section 8.2. (That these operators are simultaneously measurable with the
measurement in (63) follows from Lemma 8.5 and the fact that ker(L_j^⊥)^⊥ = ker(L_j) by Lemma 3.4
and the definition of L_j^⊥ in Section 8.2.)

The player measures |AUX⟩ with player v’s measurement strategy in 𝒮 for question y to obtain a and
replies with (y, y^⊥, a).


(HIDE_k, v): The player performs the following sequence of measurements: first measure {σ^Z_{[L^v_1(·) = y_1]}} on
register V^v_1 to obtain y_1. Then, use y_1 to specify the second linear function L^v_{2,y_1}(·) and measure
register V^v_{2,y_1} using {σ^Z_{[L^v_{2,y_1}(·) = y_2]}} to obtain y_2. This process continues until the (k − 1)-th linear
map L^v_{k−1,y_{<k−1}}(·) has been measured to obtain y_{k−1} in factor space V^v_{k−1,y_{<k−1}}. Let y = y_1 + y_2 +
⋯ + y_{k−1}. Next, for j ∈ {1, 2, …, k} the player measures

\[ \left\{ \sigma^X_{[L_j^\perp(\cdot) = y_j^\perp]} \right\} \]

where L_j^⊥ denotes the linear map (L^v_{j,y_{<j}})^⊥ as in the case (READ, v). Let y^⊥ = y^⊥_1 + y^⊥_2 + ⋯ + y^⊥_k,
where each y^⊥_j is a vector in the factor space V^v_{j,y_{<j}}. Finally, the player measures register V^v_{>k}(y)
using {σ^X_{x_{>k}}} to obtain outcome x_{>k}. Let x = x_{>k}. The player replies with (y, y^⊥, x).


By definition, when player w performs the honest measurement for question (INTRO, A) and player w̄
performs the honest measurement for question (INTRO, B), the joint outcome (y, y′) has distribution μ_{S,N}.
In this case, the players play according to strategy 𝒮 and succeed with probability 1. In all other cases,
it is straightforward to verify that the players succeed in all tests performed by D̂^INTRO (Figure 10) with
probability 1. As a result, the value of this strategy is 1.

The strategy 𝒮^INTRO is projective by construction. It is also consistent because of the assumed
consistency of the strategy 𝒮 as well as consistency of the honest Pauli strategy 𝒮^PAULI. Furthermore, note that
𝒮^INTRO only calls 𝒮 for both players on question pairs such that both types are in {INTRO, SAMPLE, READ}.
In all these cases, 𝒮 is called on a pair of questions (y, y′) distributed as (L^v(z), L^{v′}(z)) for v, v′ ∈ {A,B}
and z uniform in V. When v ≠ v′, any such pair by definition has positive probability under μ_{S,N}, and so
by assumption the associated measurements from 𝒮 commute. On the other hand, when v = v′, then the
players apply the same measurements from 𝒮, and because 𝒮 only uses projective measurements, their
measurements commute as well. Examining all other cases, it follows by direct inspection that the strategy
commutes on all question pairs whose corresponding types appear as an edge in the graph G^INTRO. Thus the
strategy commutes with respect to the support of the distribution μ_{Ŝ^INTRO,n}. □


Remark 8.7. For future reference, we note that on any input (V, λ, ℓ), the Turing machine ComputeIntroVerifier
always returns a normal form verifier V^INTRO = (S^INTRO, D^INTRO), even if V itself is not the description
of a normal form verifier. This is because, for any two integers λ, ℓ, ComputeIntroSampler(λ, ℓ) returns a
sampler with field size 2, and for any V = (S, D), λ and ℓ the decider D̂^INTRO specified in Figure 10 takes
7 inputs and always halts with a single-bit output, even if S or D themselves do not halt.


8.4 Soundness of the introspective verifier 


The main result of this section is the soundness part of Theorem 8.3. 


8.4.1 The Pauli twirl 


A key tool in the proof of soundness is the Pauli twirl. In this section we introduce the Pauli twirl and 
establish several of its properties. The section closely follows Sections 8 and 10 of [NW19]. 
To begin, we define the twirl with respect to an arbitrary distribution over unitaries. 


Definition 8.8 (Twirl). Let $\mu$ be a probability distribution over a finite set of unitary matrices. Then for any
matrix $A$, the twirl of $A$ with respect to $\mu$, denoted $\mathcal{T}_\mu(A)$, is defined as

\[ \mathcal{T}_\mu(A) = \mathbb{E}_{U \sim \mu} \big( U A U^\dagger \big) . \]
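As a small numerical illustration of Definition 8.8, the following sketch computes a twirl for nested-list matrices. The function names (`matmul`, `dagger`, `twirl`) and the single-qubit example are assumptions made for this illustration; they are not notation from the paper.

```python
from typing import List, Sequence

Matrix = List[List[complex]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a: Matrix) -> Matrix:
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def twirl(A: Matrix, unitaries: Sequence[Matrix], probs: Sequence[float]) -> Matrix:
    """Return sum_U probs[U] * U A U^dagger, i.e. the twirl of A."""
    n = len(A)
    out = [[0j] * n for _ in range(n)]
    for U, p in zip(unitaries, probs):
        conjugated = matmul(matmul(U, A), dagger(U))
        for i in range(n):
            for j in range(n):
                out[i][j] += p * conjugated[i][j]
    return out


# Hypothetical example: uniform distribution over {I, Z} on one qubit.
IDENTITY = [[1, 0], [0, 1]]
PAULI_Z = [[1, 0], [0, -1]]
A = [[1, 2], [3, 4]]

twirled = twirl(A, [IDENTITY, PAULI_Z], [0.5, 0.5])
print(twirled)
```

In this example the twirl keeps the diagonal of `A` and removes its off-diagonal entries, which is the single-qubit instance of the block-diagonal structure derived for the Pauli twirl in the lemmas that follow.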


In the next two lemmas we consider the Pauli twirl, in which the distribution $\mu$ is over subsets of Pauli
observables. First, we show how the Pauli twirl acts on Pauli matrices. Then, using this, we derive an 
expression for the Pauli twirl applied to general matrices. 




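Lemma 8.9 (Pauli twirl of Pauli matrices). Let $V$ be a subspace of $\mathbb{F}_q^k$, let $W \in \{X,Z\}$, and let $\mu$ be the
uniform distribution over $\{\tau^W(w) : w \in V\}$. Let $W' \neq W$ and $u \in \mathbb{F}_q^k$. Then

\[ \mathcal{T}_\mu\big(\tau^{W'}(u)\big) = \begin{cases} \tau^{W'}(u) & \text{if } u \in V^\perp , \\ 0 & \text{if } u \notin V^\perp . \end{cases} \]

Proof. Let $u \in \mathbb{F}_q^k$. Let $c = 1$ if $W = Z$ and let $c = -1$ if $W = X$. Then

\begin{align*}
\mathcal{T}_\mu\big(\tau^{W'}(u)\big) &= \mathbb{E}_{z \sim V} \big( \tau^W(z)\, \tau^{W'}(u)\, \tau^W(z)^\dagger \big) \\
&= \mathbb{E}_{z \sim V} \big( \omega^{\mathrm{tr}((c \cdot u) \cdot z)}\, \tau^{W'}(u)\, \tau^W(z)\, \tau^W(z)^\dagger \big) \\
&= \Big( \mathbb{E}_{z \sim V}\, \omega^{\mathrm{tr}((c \cdot u) \cdot z)} \Big)\, \tau^{W'}(u) .
\end{align*}

The lemma now follows from Lemma 3.25 and the fact that $c \cdot u \in V^\perp$ if and only if $u \in V^\perp$. □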
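The last step of the proof rests on a character-sum identity: the average of $\omega^{\mathrm{tr}(u \cdot z)}$ over $z \in V$ is 1 when $u \in V^\perp$ and 0 otherwise. The sketch below checks this by brute force in the special case $q = 2$, where $\omega = -1$ and the trace is trivial. The restriction to $\mathbb{F}_2$, the choice of subspace, and all function names are assumptions made for this illustration.

```python
from itertools import product


def span_f2(generators, k):
    """All F_2-linear combinations of the generators (vectors of length k)."""
    vectors = {tuple([0] * k)}
    for g in generators:
        vectors |= {tuple((a + b) % 2 for a, b in zip(v, g)) for v in vectors}
    return sorted(vectors)


def dot(u, z):
    """Inner product over F_2."""
    return sum(a * b for a, b in zip(u, z)) % 2


def character_average(u, subspace):
    """E_{z in subspace} (-1)^{u . z}."""
    return sum((-1) ** dot(u, z) for z in subspace) / len(subspace)


def in_orthogonal_complement(u, subspace):
    return all(dot(u, z) == 0 for z in subspace)


K = 3
V = span_f2([(1, 1, 0)], K)  # hypothetical subspace {000, 110} of F_2^3
averages = {u: character_average(u, V) for u in product((0, 1), repeat=K)}

for u, avg in averages.items():
    print(u, avg, in_orthogonal_complement(u, V))
```

Every vector `u` yields an average of exactly 1 or 0, and the value 1 occurs precisely for the vectors orthogonal to all of `V`.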


Lemma 8.10 (Pauli twirl of general matrices). Let $V = \mathbb{F}_q^k$, and let $L : V \to V$ be a linear map. Let $\zeta$ be
the uniform distribution over $\{\tau^Z(z) \mid z \in V\}$ and $\chi$ the uniform distribution over $\{\tau^X(x) \mid x \in \ker(L)\}$.
Let $M$ be a matrix acting on $\mathbb{C}^V \otimes \mathcal{H}_A$, where $\mathcal{H}_A$ is a finite dimensional Hilbert space. Then there exist
matrices $\{M^y\}_{y \in L(V)}$ acting on $\mathcal{H}_A$ such that the twirl of $M$ with respect to $\zeta$ and $\chi$ is given by

\[ \big((\mathcal{T}_\zeta \circ \mathcal{T}_\chi) \otimes I_A\big)(M) = \sum_{y \in L(V)} \tau^Z_{[L(\cdot)=y]} \otimes M^y . \tag{65} \]

Moreover, if we apply (65) to each element of a POVM measurement $\{M_a\}$, then for each $y \in V$, the set
$\{M^y_a\}$ also forms a POVM measurement.


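Proof. The collection $\{\tau^X(x)\tau^Z(z)\}_{x,z \in V}$ forms a basis for the complex linear space of matrices acting on
$\mathbb{C}^V$. As a result, we can write

\[ M = \sum_{x,z \in V} \tau^X(x)\tau^Z(z) \otimes M_{x,z} , \]

for matrices $M_{x,z}$ on $\mathcal{H}_A$. We now use Lemma 8.9 to compute the twirl first with respect to $\zeta$ and then with
respect to $\zeta$ and $\chi$ (the two twirls commute, because Pauli matrices commute up to a phase):

\begin{align*}
(\mathcal{T}_\zeta \otimes I_A)(M) &= \sum_{x,z \in V} \mathcal{T}_\zeta\big(\tau^X(x)\big)\, \tau^Z(z) \otimes M_{x,z} = \sum_{z \in V} \tau^Z(z) \otimes M_{0,z} , \\
\big((\mathcal{T}_\zeta \circ \mathcal{T}_\chi) \otimes I_A\big)(M) &= \sum_{z \in V} \mathcal{T}_\chi\big(\tau^Z(z)\big) \otimes M_{0,z} = \sum_{z \in \ker(L)^\perp} \tau^Z(z) \otimes M_{0,z} .
\end{align*}

For all $y \in V$, let $u_y$ denote an arbitrary element of $L^{-1}(y)$ if $y$ is in the image of $L$; otherwise, set $u_y = 0$.
Expanding $\tau^Z(z)$ using the first part of Lemma 8.4,

\begin{align*}
\sum_{z \in \ker(L)^\perp} \tau^Z(z) \otimes M_{0,z} &= \sum_{z \in \ker(L)^\perp} \sum_y \omega^{\mathrm{tr}(u_y \cdot z)}\, \tau^Z_{[L(\cdot)=y]} \otimes M_{0,z} \\
&= \sum_y \tau^Z_{[L(\cdot)=y]} \otimes \Big( \sum_{z \in \ker(L)^\perp} \omega^{\mathrm{tr}(u_y \cdot z)} M_{0,z} \Big) . \tag{66}
\end{align*}

Equation (65) follows by setting $M^y = \sum_{z \in \ker(L)^\perp} \omega^{\mathrm{tr}(u_y \cdot z)} M_{0,z}$.

For the "moreover" part, note first that whenever $M \geq 0$ it holds that any twirl satisfies $0 \leq \mathcal{T}_\mu(M)$. As
a result, each matrix $M^y$ must be positive semidefinite due to Equation (66) and the fact that the $\{\tau^Z_{[L(\cdot)=y]}\}_y$
matrices are orthogonal projections. Next, suppose $\{M_a\}$ is a POVM measurement, and write $N = \sum_a M_a$
for the identity matrix. Then by linearity, for each $y \in V$, $\sum_a M^y_a = N^y$. In addition, $N_{0,0} = I$, and
$N_{x,z} = 0$ otherwise. As a result, $N^y = N_{0,0} = I$, and so $\{M^y_a\}$ also forms a POVM. □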
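The "moreover" part can be checked by hand in the smallest case. The sketch below takes a single qubit ($q = 2$, $k = 1$), $L$ the identity map (so $\ker(L) = \{0\}$ and only the $Z$-twirl acts), and a trivial space $\mathcal{H}_A$, so that each $M^y_a$ is just the $y$-th diagonal entry of $M_a$. These simplifications, the example POVM, and the helper name `z_twirl` are assumptions made for this illustration.

```python
def z_twirl(M):
    """Average of M and Z M Z for a 2x2 matrix M.

    Conjugating by Pauli Z flips the sign of the off-diagonal entries.
    """
    zmz = [[M[0][0], -M[0][1]], [-M[1][0], M[1][1]]]
    return [[(M[i][j] + zmz[i][j]) / 2 for j in range(2)] for i in range(2)]


# Hypothetical POVM: projectors onto the |+> and |-> states.
POVM = {
    "+": [[0.5, 0.5], [0.5, 0.5]],
    "-": [[0.5, -0.5], [-0.5, 0.5]],
}

twirled = {a: z_twirl(M) for a, M in POVM.items()}

# With a trivial second tensor factor, M^y_a is the y-th diagonal entry.
blocks = {y: {a: twirled[a][y][y] for a in POVM} for y in (0, 1)}

for y in (0, 1):
    print(y, blocks[y], sum(blocks[y].values()))
```

Each twirled element is diagonal, and for each fixed `y` the entries `blocks[y][a]` are nonnegative and sum to 1 over `a`, as the lemma requires of a POVM.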


In the next few lemmas we derive a sufficient condition for a measurement to be close to its own Pauli 
twirl, namely that it satisfies certain commutation relations with the Pauli basis measurements. 


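Lemma 8.11 (Commuting with Pauli basis implies commuting with Pauli observables). Let $\mathcal{X}, \mathcal{A}$ be finite
sets and $D$ be a distribution over $\mathcal{X}$. For each $x \in \mathcal{X}$, let $V_x$ be a register subspace of $V = \mathbb{F}_q^k$, and let
$L_x : V_x \to V_x$ be a linear map.

Consider a state $|\psi\rangle = |\mathrm{EPR}\rangle_V \otimes |\mathrm{AUX}\rangle$, where $|\mathrm{EPR}\rangle_V \in \mathcal{H}_{A'} \otimes \mathcal{H}_{B'}$, for $\mathcal{H}_{A'}, \mathcal{H}_{B'} \cong \mathbb{C}^V$, is defined
in Definition 3.22 and $|\mathrm{AUX}\rangle \in \mathcal{H}_{A''} \otimes \mathcal{H}_{B''}$ is arbitrary. For each $x \in \mathcal{X}$, let $\{M^x_a\}_{a \in \mathcal{A}}$ be a measurement
on $\mathcal{H}_{A'} \otimes \mathcal{H}_{A''}$. Let $W \in \{X,Z\}$. Then the following are equivalent on the state $|\psi\rangle$:

• On average over $x \sim D$,
\[ \big[ M^x_a ,\, \big( \tau^W_{[L_x(\cdot)=y]} \otimes I_{\overline{V}_x} \otimes I_{A''} \big) \big] \otimes I_B \approx_\varepsilon 0 . \]

• On average over $x \sim D$ and $v$ drawn uniformly from $\ker(L_x)^\perp$,
\[ \big[ M^x_a ,\, \big( \tau^W(v) \otimes I_{\overline{V}_x} \otimes I_{A''} \big) \big] \otimes I_B \approx_\varepsilon 0 . \]

In the equations above, for each $x \in \mathcal{X}$ we decompose $\mathcal{H}_{A'} = \mathbb{C}^{V_x} \otimes \mathbb{C}^{\overline{V}_x}$ and interpret $\tau^W_{[L_x(\cdot)=y]}$ and $\tau^W(v)$
for $v \in \ker(L_x)^\perp$ as acting on $\mathbb{C}^{V_x}$.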
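Proof. For $x \in \mathcal{X}$, $a \in \mathcal{A}$, $y$ in the range of $L_x$, and $v \in \mathbb{F}_q^k$ define

\begin{align*}
A^x_{a,y} &= \big[ M^x_a ,\, \big( \tau^W_{[L_x(\cdot)=y]} \otimes I_{\overline{V}_x} \otimes I_{A''} \big) \big] \otimes I_B , \\
A^x_a(v) &= \big[ M^x_a ,\, \big( \tau^W(v) \otimes I_{\overline{V}_x} \otimes I_{A''} \big) \big] \otimes I_B .
\end{align*}

By the first item of Lemma 8.4, for each $v \in \ker(L_x)^\perp$,

\[ A^x_a(v) = \sum_{y \in V_x} \omega^{\mathrm{tr}(u_y \cdot v)} A^x_{a,y} , \]

where for every $y$ in the range of $L_x$, $u_y$ is a fixed element in $L_x^{-1}(y)$. The expression for the closeness of
$A^x_a(v)$ to 0 on average over $x \sim D$ and $v \sim \ker(L_x)^\perp$ (i.e. the second quantity of the lemma statement) is
equal to

\[ \mathbb{E}_{x \sim D}\, \mathbb{E}_{v \sim \ker(L_x)^\perp} \sum_a \big\| A^x_a(v) |\psi\rangle \big\|^2 \]

and can be expanded as

\[ \mathbb{E}_{x \sim D}\, \mathbb{E}_{v \sim \ker(L_x)^\perp} \sum_a \sum_{y,y'} \omega^{\mathrm{tr}((u_y - u_{y'}) \cdot v)} \langle\psi| (A^x_{a,y'})^\dagger A^x_{a,y} |\psi\rangle . \tag{67} \]

If $y \neq y'$, then by definition $L_x(u_y) \neq L_x(u_{y'})$, and so $u_y - u_{y'}$ is not in $\ker(L_x)$. Lemma 3.25 (with
$V = \ker(L_x)^\perp$) thus implies that (67) equals

\[ \mathbb{E}_{x \sim D} \sum_{a,y} \langle\psi| (A^x_{a,y})^\dagger A^x_{a,y} |\psi\rangle = \mathbb{E}_{x \sim D} \sum_{a,y} \big\| A^x_{a,y} |\psi\rangle \big\|^2 , \]

which is the closeness of $A^x_{a,y}$ to 0 on average over $x \sim D$. Thus, $A^x_a(v) \approx_\varepsilon 0$ on average over $x$ and $v$ if
and only if $A^x_{a,y} \approx_\varepsilon 0$ on average over $x$. □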


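Lemma 8.12 (Commuting implies twirl). Let $\mathcal{X}$ be a finite set and $D$ a distribution on $\mathcal{X}$. For each $x \in \mathcal{X}$,
let $\{M^x_a\}$ be a POVM on $\mathcal{H}_A$, and let $\mu_x$ be a distribution over unitary matrices acting on $\mathcal{H}_A$. Suppose
that on average over $x \sim D$ and $U \sim \mu_x$ it holds that

\[ [M^x_a, U^\dagger] \otimes I_B \approx_\varepsilon 0 , \]

where the commutator is evaluated on a state $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$. Then on average over $x \sim D$,

\[ M^x_a \otimes I_B \approx_\varepsilon \mathcal{T}_{\mu_x}(M^x_a) \otimes I_B . \]

Proof. Observe that

\[ \mathbb{E}_x \sum_a \big\| \big( \mathcal{T}_{\mu_x}(M^x_a) - M^x_a \big) \otimes I_B |\psi\rangle \big\|^2 = \mathbb{E}_x \sum_a \Big\| \mathbb{E}_{U \sim \mu_x} \big( U [M^x_a, U^\dagger] \big) \otimes I_B |\psi\rangle \Big\|^2 . \tag{68} \]

Applying Jensen's inequality, the right-hand side of (68) is at most

\[ \mathbb{E}_{x \sim D}\, \mathbb{E}_{U \sim \mu_x} \sum_a \big\| \big( U [M^x_a, U^\dagger] \big) \otimes I_B |\psi\rangle \big\|^2 = \mathbb{E}_{x \sim D}\, \mathbb{E}_{U \sim \mu_x} \sum_a \big\| [M^x_a, U^\dagger] \otimes I_B |\psi\rangle \big\|^2 , \]

using the unitary invariance of the Euclidean norm. This last quantity is $O(\varepsilon)$, by assumption. □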
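The inequality behind Lemma 8.12 can be checked numerically in a toy case. The sketch below uses one qubit, the uniform distribution over $\{I, Z\}$, and squared Frobenius norms in place of the state-dependent norm (on a maximally entangled state the two agree up to a factor of the dimension). The example matrix and all helper names are assumptions made for this illustration.

```python
def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(len(a[0]))] for i in range(len(a))]


def frobenius_sq(a):
    """Squared Frobenius norm."""
    return sum(abs(x) ** 2 for row in a for x in row)


def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
PAULI_Z = [[1.0, 0.0], [0.0, -1.0]]
UNITARIES = [IDENTITY, PAULI_Z]  # both are real and symmetric, so U^dagger = U

M = [[0.5, 0.5], [0.5, 0.5]]  # hypothetical measurement operator |+><+|

# Twirl of M: the average of U M U^dagger over the two unitaries.
conjugated = [matmul(matmul(U, M), U) for U in UNITARIES]
twirl_M = [[sum(c[i][j] for c in conjugated) / len(UNITARIES) for j in range(2)]
           for i in range(2)]

lhs = frobenius_sq(sub(twirl_M, M))
rhs = sum(frobenius_sq(commutator(M, U)) for U in UNITARIES) / len(UNITARIES)
print(lhs, rhs)
```

Here the distance between `M` and its twirl is bounded by the average squared commutator, mirroring the Jensen step in the proof. For an operator that commutes with every unitary in the support, such as a diagonal matrix, both sides are zero.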


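Lemma 8.13 (Commuting with each implies commuting with both). Let $\mathcal{X}$ be a finite set and $D$ a distribution
on $\mathcal{X}$. For each $x \in \mathcal{X}$, let $\{M^x_a\}$ be a POVM on $\mathcal{H}_A$ and let $\mu_{x,1}, \mu_{x,2}$ be two distributions over
unitary matrices acting on $\mathcal{H}_A$. Suppose that for each $i \in \{1,2\}$, on average over $x \sim D$ and $U_i \sim \mu_{x,i}$,

\[ [M^x_a, U_i^\dagger] \otimes I_B \approx_\varepsilon 0 , \tag{69} \]

where the expression is evaluated on some state $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$, where $\mathcal{H}_B \cong \mathcal{H}_A$. Suppose further that
on average over $x \sim D$ and $U_2 \sim \mu_{x,2}$,

\[ U_2^\dagger \otimes I_B \approx_\varepsilon I_A \otimes \overline{U_2} . \tag{70} \]

(The corresponding statement for $i = 1$ is not needed.) Then on average over $x \sim D$, $U_1 \sim \mu_{x,1}$, and
$U_2 \sim \mu_{x,2}$,

\[ [M^x_a, U_1^\dagger U_2^\dagger] \otimes I_B \approx_\varepsilon 0 . \]

Proof. The claim follows from the following sequence of approximations:

\begin{align*}
M^x_a U_1^\dagger U_2^\dagger \otimes I_B &\approx_\varepsilon M^x_a U_1^\dagger \otimes \overline{U_2} && \text{(by (70))} \\
&\approx_\varepsilon U_1^\dagger M^x_a \otimes \overline{U_2} && \text{(by (69) for } i = 1\text{)} \\
&\approx_\varepsilon U_1^\dagger M^x_a U_2^\dagger \otimes I_B && \text{(by (70))} \\
&\approx_\varepsilon U_1^\dagger U_2^\dagger M^x_a \otimes I_B , && \text{(by (69) for } i = 2\text{)}
\end{align*}

where each step also uses Fact 5.19. This is equivalent to $[M^x_a, U_1^\dagger U_2^\dagger] \otimes I_B \approx_\varepsilon 0$, completing the proof. □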




The following is a slight generalization of [NW19, Fact 4.25], and we give a similar proof. 


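Lemma 8.14 (Close to sub-measurement implies close to measurement). Let $\mathcal{X}, \mathcal{A}$ be finite sets and $D$ be
a distribution on $\mathcal{X}$. Suppose that for each $x \in \mathcal{X}$, $\{A^x_a\}_{a \in \mathcal{A}}$ is a projective measurement and $\{B^x_a\}_{a \in \mathcal{A}}$ is
a set of matrices such that each $B^x_a$ is positive semidefinite and $\sum_a B^x_a \leq I$. Suppose $\{C^x_a\}_{a \in \mathcal{A}}$ is a POVM
such that $C^x_a \geq B^x_a$ for all $x$ and $a$. Then if, on average over $x \sim D$, $A^x_a \approx_\varepsilon B^x_a$, then, on average over
$x \sim D$, $A^x_a \approx_{\varepsilon^{1/2}} C^x_a$.

Proof. By the triangle inequality,

\[ \mathbb{E}_x \sum_a \big\| (A^x_a - C^x_a) |\psi\rangle \big\|^2 \leq 2\, \mathbb{E}_x \sum_a \big\| (A^x_a - B^x_a) |\psi\rangle \big\|^2 + 2\, \mathbb{E}_x \sum_a \big\| (C^x_a - B^x_a) |\psi\rangle \big\|^2 . \]

The first term on the right-hand side is $O(\varepsilon)$ by assumption. For the second,

\begin{align*}
\mathbb{E}_x \sum_a \big\| (C^x_a - B^x_a) |\psi\rangle \big\|^2 &= \mathbb{E}_x \sum_a \langle\psi| (C^x_a - B^x_a)^2 |\psi\rangle \leq \mathbb{E}_x \sum_a \langle\psi| (C^x_a - B^x_a) |\psi\rangle \\
&= 1 - \mathbb{E}_x \sum_a \langle\psi| B^x_a |\psi\rangle \leq 1 - \mathbb{E}_x \sum_a \langle\psi| (B^x_a)^2 |\psi\rangle ,
\end{align*}

where the middle inequality uses $0 \leq C^x_a - B^x_a \leq I$ for all $x, a$. Write $1 = \mathbb{E}_x \sum_a \langle\psi| (A^x_a)^2 |\psi\rangle$, which holds
because $A$ is a projective measurement. Then

\begin{align*}
\mathbb{E}_x \sum_a \langle\psi| \big( (A^x_a)^2 - (B^x_a)^2 \big) |\psi\rangle &= \Re\Big( \mathbb{E}_x \sum_a \langle\psi| (A^x_a + B^x_a)(A^x_a - B^x_a) |\psi\rangle \Big) \\
&\leq \mathbb{E}_x \sqrt{ \sum_a \big\| (A^x_a + B^x_a) |\psi\rangle \big\|^2 } \cdot \sqrt{ \sum_a \big\| (A^x_a - B^x_a) |\psi\rangle \big\|^2 } ,
\end{align*}

where the first equality follows from the fact that $A^x_a$ and $B^x_a$ are Hermitian. For each $x \in \mathcal{X}$ the first square
root is $O(1)$. This allows us to move the expectation into the second square root by Jensen's inequality. The
result is $O(\varepsilon^{1/2})$ by assumption. □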


Now we put everything together to show the main result of this section. 


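Lemma 8.15. Let $\mathcal{X}, \mathcal{A}$ be finite sets and $D$ be a distribution over $\mathcal{X}$. For each $x \in \mathcal{X}$, let $V_x$ be a register
subspace of $V = \mathbb{F}_q^k$, let $U_x$ be a register subspace of $V_x$, and let $L_x : U_x \to U_x$ be a linear map.

Consider a state $|\psi\rangle = |\mathrm{EPR}\rangle_V \otimes |\mathrm{AUX}\rangle$, where $|\mathrm{EPR}\rangle_V \in \mathcal{H}_{A'} \otimes \mathcal{H}_{B'}$, for $\mathcal{H}_{A'}, \mathcal{H}_{B'} \cong \mathbb{C}^V$, is defined
in Definition 3.22 and $|\mathrm{AUX}\rangle \in \mathcal{H}_{A''} \otimes \mathcal{H}_{B''}$ is arbitrary. For each $x \in \mathcal{X}$, let $\{M^x_{y,a}\}_{y \in U_x, a \in \mathcal{A}}$ be a
projective measurement on $\mathcal{H}_{V_x} \otimes \mathcal{H}_{A''}$. Suppose that on average over $x \sim D$ the following conditions
hold.

\begin{align*}
\text{(Consistency):} \quad & \big( M^x_y \otimes I_{\overline{V}_x} - \tau^Z_{[L_x(\cdot)=y]} \otimes I_{\overline{U}_x} \otimes I_{A''} \big) \otimes I_B \approx_\varepsilon 0 , \\
\text{(Commutation):} \quad & \big[ M^x_{y,a} \otimes I_{\overline{V}_x} ,\, \tau^Z_z \otimes I_{\overline{U}_x} \otimes I_{A''} \big] \otimes I_B \approx_\varepsilon 0 , \\
& \big[ M^x_{y,a} \otimes I_{\overline{V}_x} ,\, \tau^X_{[L_x^\perp(\cdot)=z]} \otimes I_{\overline{U}_x} \otimes I_{A''} \big] \otimes I_B \approx_\varepsilon 0 .
\end{align*}

Here, the projector $\tau^Z_{[L_x(\cdot)=y]}$ acts on the register subspace $\mathcal{H}_{U_x}$, and $\overline{U}_x$ and $\overline{V}_x$ denote the complementary
register subspaces of $U_x$ and $V_x$, respectively, within $V$. Furthermore, the answer summations are over all
$y, z \in U_x$ and $a \in \mathcal{A}$.

Then for each $x \in \mathcal{X}$ and $y \in U_x$, there exists a POVM measurement $\{M^{x,y}_a\}_{a \in \mathcal{A}}$ on $\mathcal{H}_{V_x \setminus U_x} \otimes \mathcal{H}_{A''}$
such that on average over $x \sim D$,

\[ \big( M^x_{y,a} \otimes I_{\overline{V}_x} \big) \otimes I_B \approx_{\varepsilon^{1/2}} \big( \tau^Z_{[L_x(\cdot)=y]} \otimes M^{x,y}_a \otimes I_{\overline{V}_x} \big) \otimes I_B . \]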


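Proof. For each $x \in \mathcal{X}$, let $\zeta_x$ be the uniform distribution over $U_x$ and $\chi_x$ be the uniform distribution over
$\ker(L_x)$. Applying Lemma 8.11 to the first commutation assumption (letting "$L_x$" in Lemma 8.11 be the
identity map on $U_x$, and letting "$D$" be the uniform distribution over $\ker(L_x)^\perp = U_x$), we get that on
average over $x \sim D$ and $v \sim \zeta_x$, we have

\[ \big[ M^x_{y,a} \otimes I_{\overline{V}_x} ,\, \tau^Z(v) \otimes I_{\overline{U}_x} \otimes I_{A''} \big] \otimes I_B \approx_\varepsilon 0 . \]

Applying Lemma 8.11 to the second commutation assumption (letting "$L_x$" in Lemma 8.11 be the $L_x^\perp$ in the
statement of Lemma 8.15 and letting $D$ be the uniform distribution over $\ker(L_x^\perp)^\perp$, which is $\ker(L_x)$ by
Lemma 3.12), we get on average over $x \sim D$ and $v \sim \chi_x$, we have

\[ \big[ M^x_{y,a} \otimes I_{\overline{V}_x} ,\, \tau^X(v) \otimes I_{\overline{U}_x} \otimes I_{A''} \big] \otimes I_B \approx_\varepsilon 0 . \]

By Lemma 8.13, this implies that on average over $x \sim D$, $u \sim \zeta_x$, and $v \sim \chi_x$,

\[ \big[ M^x_{y,a} \otimes I_{\overline{V}_x} ,\, \tau^Z(u)\tau^X(v) \otimes I_{\overline{U}_x} \otimes I_{A''} \big] \otimes I_B \approx_\varepsilon 0 . \]

By Lemma 8.12 and Lemma 8.10, this implies that on average over $x \sim D$,

\begin{align*}
M^x_{y,a} \otimes I_{\overline{V}_x} \otimes I_B &\approx_\varepsilon \big( \mathcal{T}_{\zeta_x} \circ \mathcal{T}_{\chi_x} (M^x_{y,a}) \big) \otimes I_{\overline{V}_x} \otimes I_B \\
&= \Big( \sum_{y' \in U_x} \tau^Z_{[L_x(\cdot)=y']} \otimes M^{x,y'}_{y,a} \Big) \otimes I_{\overline{V}_x} \otimes I_B , \tag{71}
\end{align*}

for some POVM measurement $\{M^{x,y'}_{y,a}\}_{y,a}$ on $\mathcal{H}_{V_x \setminus U_x} \otimes \mathcal{H}_{A''}$.

In the following sequence of equations, whenever an operator does not act on a subsystem it should be
assumed that it is appropriately tensored with the identity. For clarity, we explicitly indicate using a subscript
A or B whether a Pauli operator acts on $\mathcal{H}_A$ or $\mathcal{H}_B$. Then on average over $x \sim D$ we have

\begin{align*}
M^x_{y,a} &= M^x_{y,a} \cdot M^x_y && \text{($M^x$ is projective)} \\
&\approx_\varepsilon M^x_{y,a} \cdot \big( \tau^Z_{[L_x(\cdot)=y]} \big)_A && \text{(Consistency assumption)} \\
&\approx_0 M^x_{y,a} \otimes \big( \tau^Z_{[L_x(\cdot)=y]} \big)_B && \text{(Paulis are self-consistent)} \\
&\approx_\varepsilon \Big( \sum_{y'} \big( \tau^Z_{[L_x(\cdot)=y']} \big)_A \otimes M^{x,y'}_{y,a} \Big) \otimes \big( \tau^Z_{[L_x(\cdot)=y]} \big)_B && \text{(Equation (71))} \\
&\approx_0 \sum_{y'} \big( \tau^Z_{[L_x(\cdot)=y']} \tau^Z_{[L_x(\cdot)=y]} \big)_A \otimes M^{x,y'}_{y,a} && \text{(Paulis are self-consistent)} \\
&= \big( \tau^Z_{[L_x(\cdot)=y]} \big)_A \otimes M^{x,y}_{y,a} ,
\end{align*}

where $\approx_0$ indicates equality with respect to the state $|\mathrm{EPR}\rangle_V \otimes |\mathrm{AUX}\rangle$, and we have repeatedly used
Fact 5.19. This is essentially the statement promised by the lemma, except that $\{M^{x,y}_{y,a}\}_a$ does not
necessarily sum to identity (since we only sum over $a$). To remedy this, define $M^{x,y}_a = \sum_{y'} M^{x,y}_{y',a}$ and note that
$M^{x,y}_a \geq M^{x,y}_{y,a}$, so by Lemma 8.14 and the fact that $M^x$ is projective,

\[ \big( M^x_{y,a} \otimes I_{\overline{V}_x} \big) \otimes I_B \approx_{\varepsilon^{1/2}} \big( \tau^Z_{[L_x(\cdot)=y]} \otimes M^{x,y}_a \otimes I_{\overline{V}_x} \big) \otimes I_B . \tag{72} \]

$\{M^{x,y}_a\}$ is the POVM measurement guaranteed in the lemma statement, which concludes the proof. □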




8.4.2 Preliminary lemmas 


We show a few simple lemmas that allow us to argue about measurements that have a decomposition across 
a tensor product of two Hilbert spaces, within the space of a single player. 


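Lemma 8.16. Let $\mathcal{A}, \mathcal{B}$ be finite sets. Let $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ be a state. Consider the following: for all
$a \in \mathcal{A}$,

1. Let $\mathcal{H}_{A,a}$, $\mathcal{H}_{B,a}$, $\mathcal{H}_{A',a}$ and $\mathcal{H}_{B',a}$ be Hilbert spaces such that
\[ \mathcal{H}_A = \mathcal{H}_{A,a} \otimes \mathcal{H}_{A',a} \quad \text{and} \quad \mathcal{H}_B = \mathcal{H}_{B,a} \otimes \mathcal{H}_{B',a} , \]
and let $|\psi_{\mathrm{QUE},a}\rangle \in \mathcal{H}_{A,a} \otimes \mathcal{H}_{B,a}$ be a "question state" and $|\psi_{\mathrm{ANS},a}\rangle \in \mathcal{H}_{A',a} \otimes \mathcal{H}_{B',a}$ be an "answer
state" such that $|\psi\rangle = |\psi_{\mathrm{QUE},a}\rangle \otimes |\psi_{\mathrm{ANS},a}\rangle$.

2. Let $Q_a$ be projectors on $\mathcal{H}_{A,a}$ such that $\{Q_a \otimes I_{\mathcal{H}_{A',a}}\}_{a \in \mathcal{A}}$ forms a projective measurement on $\mathcal{H}_A$ and
let $\{A^a_b\}_{b \in \mathcal{B}}$ and $\{B^a_b\}_{b \in \mathcal{B}}$ be matrices acting on $\mathcal{H}_{A',a}$.

3. Let $D$ be the distribution on $\mathcal{A}$ obtained by measuring $|\psi\rangle$ using $\{Q_a \otimes I_{\mathcal{H}_{A',a}}\}_{a \in \mathcal{A}}$.

Then the following are equivalent:

• On average over $a \sim D$ and with respect to state $|\psi\rangle$,
\[ \big( I_{\mathcal{H}_{A,a}} \otimes A^a_b \big) \otimes I_B \approx_\varepsilon \big( I_{\mathcal{H}_{A,a}} \otimes B^a_b \big) \otimes I_B . \]

• $(Q_a \otimes A^a_b) \otimes I_B \approx_\varepsilon (Q_a \otimes B^a_b) \otimes I_B$ on state $|\psi\rangle$.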


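Proof. Expand

\begin{align*}
& \sum_{a,b} \big\| (Q_a \otimes A^a_b - Q_a \otimes B^a_b) \otimes I_B \cdot |\psi_{\mathrm{QUE},a}\rangle \otimes |\psi_{\mathrm{ANS},a}\rangle \big\|^2 \\
&= \sum_{a,b} \big\| Q_a \otimes (A^a_b - B^a_b) \otimes I_B \cdot |\psi_{\mathrm{QUE},a}\rangle \otimes |\psi_{\mathrm{ANS},a}\rangle \big\|^2 \\
&= \sum_{a,b} \big\| Q_a \otimes I_{\mathcal{H}_{B,a}} |\psi_{\mathrm{QUE},a}\rangle \big\|^2 \cdot \big\| (A^a_b - B^a_b) \otimes I_{\mathcal{H}_{B',a}} |\psi_{\mathrm{ANS},a}\rangle \big\|^2 \\
&= \sum_{a,b} \big\| Q_a \otimes I_{\mathcal{H}_{A',a}} \otimes I_B |\psi\rangle \big\|^2 \cdot \big\| (A^a_b - B^a_b) \otimes I_{\mathcal{H}_{B',a}} |\psi_{\mathrm{ANS},a}\rangle \big\|^2 \\
&= \mathbb{E}_{a \sim D} \sum_b \big\| (A^a_b - B^a_b) \otimes I_{\mathcal{H}_{B',a}} |\psi_{\mathrm{ANS},a}\rangle \big\|^2 \\
&= \mathbb{E}_{a \sim D} \sum_b \big\| I_{\mathcal{H}_{A,a}} \otimes (A^a_b - B^a_b) \otimes I_B \cdot |\psi_{\mathrm{QUE},a}\rangle \otimes |\psi_{\mathrm{ANS},a}\rangle \big\|^2 .
\end{align*}

In going from the third to the fourth line we used the fact that

\[ \big\| Q_a \otimes I_{\mathcal{H}_{B,a}} |\psi_{\mathrm{QUE},a}\rangle \big\|^2 = \big\| Q_a \otimes I_{\mathcal{H}_{A',a}} \otimes I_B\, |\psi_{\mathrm{QUE},a}\rangle \otimes |\psi_{\mathrm{ANS},a}\rangle \big\|^2 . \]

Hence, the first line is $O(\varepsilon)$ if and only if the last one is. □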




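Lemma 8.17. Let $\mathcal{A}, \mathcal{B}, \mathcal{C}$ be finite sets. Let $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ be a state, and let $\{A_{a,b}\}_{a \in \mathcal{A}, b \in \mathcal{B}}$ and
$\{B_{a,c}\}_{a \in \mathcal{A}, c \in \mathcal{C}}$ be POVMs acting on $\mathcal{H}_A$. Suppose further that:

1. The measurements approximately commute, i.e.
\[ [A_{a,b}, B_{a,c}] \otimes I_B \approx_\delta 0 , \]
where the approximation holds with respect to the state $|\psi\rangle$.

2. For all $a \in \mathcal{A}$, there exist Hilbert spaces $\mathcal{H}_{A,a}$, $\mathcal{H}_{A',a}$, $\mathcal{H}_{B,a}$, $\mathcal{H}_{B',a}$ and states $|\psi_{\mathrm{QUE},a}\rangle \in \mathcal{H}_{A,a} \otimes \mathcal{H}_{B,a}$,
$|\psi_{\mathrm{ANS},a}\rangle \in \mathcal{H}_{A',a} \otimes \mathcal{H}_{B',a}$ such that
\begin{align*}
\mathcal{H}_A &= \mathcal{H}_{A,a} \otimes \mathcal{H}_{A',a} , \\
\mathcal{H}_B &= \mathcal{H}_{B,a} \otimes \mathcal{H}_{B',a} , \\
|\psi\rangle &= |\psi_{\mathrm{QUE},a}\rangle \otimes |\psi_{\mathrm{ANS},a}\rangle .
\end{align*}

3. For all $a \in \mathcal{A}$, there exist projectors $Q_a$ on $\mathcal{H}_{A,a}$ and matrices $\{A^a_b\}_{b \in \mathcal{B}}$, $\{B^a_c\}_{c \in \mathcal{C}}$ acting on $\mathcal{H}_{A',a}$
such that $\{Q_a \otimes I_{\mathcal{H}_{A',a}}\}_{a \in \mathcal{A}}$ is a projective measurement on $\mathcal{H}_A$ and
\[ A_{a,b} = Q_a \otimes A^a_b , \qquad B_{a,c} = Q_a \otimes B^a_c . \]

Then
\[ \big[ I_{\mathcal{H}_{A,a}} \otimes A^a_b ,\, I_{\mathcal{H}_{A,a}} \otimes B^a_c \big] \otimes I_B \approx_\delta 0 , \]
on average over $a \sim D$ where $D$ is the distribution on $\mathcal{A}$ obtained by measuring $|\psi\rangle$ using
$\{Q_a \otimes I_{\mathcal{H}_{A',a}}\}_{a \in \mathcal{A}}$.

Proof. The assumptions of the lemma imply that

\begin{align*}
(Q_a \otimes A^a_b B^a_c) \otimes I_B &= \big( (Q_a \otimes A^a_b) \cdot (Q_a \otimes B^a_c) \big) \otimes I_B && \text{($Q_a$ is a projector)} \\
&= (A_{a,b} \cdot B_{a,c}) \otimes I_B && \text{(Item 3)} \\
&\approx_\delta (B_{a,c} \cdot A_{a,b}) \otimes I_B && \text{(Item 1)} \\
&= \big( (Q_a \otimes B^a_c) \cdot (Q_a \otimes A^a_b) \big) \otimes I_B && \text{(Item 3)} \\
&= (Q_a \otimes B^a_c A^a_b) \otimes I_B . && \text{($Q_a$ is a projector)}
\end{align*}

We apply Lemma 8.16 as follows. The set "$\mathcal{A}$" in Lemma 8.16 is the same as $\mathcal{A}$ here, and the set "$\mathcal{B}$" is
the product set $\mathcal{B} \times \mathcal{C}$ here. The matrices "$\{A^a_b\}$" are $\{A^a_b B^a_c\}$ here and "$\{B^a_b\}$" are $\{B^a_c A^a_b\}$ here. We then
obtain, on average over $a \sim D$,

\[ \big( I_{\mathcal{H}_{A,a}} \otimes A^a_b B^a_c \big) \otimes I_B \approx_\delta \big( I_{\mathcal{H}_{A,a}} \otimes B^a_c A^a_b \big) \otimes I_B . \]

This implies the conclusion of the lemma. □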


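Lemma 8.18. Let $\mathcal{Y}$ be a finite set and for all $y \in \mathcal{Y}$ let $\{A^y_{x,z}\}$ be a POVM on $\mathcal{H}_A$. Let $\{B_{x,y,z}\}$ be a
projective measurement on $\mathcal{H}_B$. Suppose that

\[ \sum_{x,y,z} \langle\psi| A^y_{x,z} \otimes B_{x,y,z} |\psi\rangle \geq 1 - \delta . \tag{73} \]

Then with respect to state $|\psi\rangle$,

\[ I_A \otimes B_{x,y,z} \approx_\delta A^y_{x,z} \otimes B_{x,y} . \]

Proof. Using the fact that $\{B_{x,y,z}\}$ is projective, we have $B_{x,y,z} = B_{x,y} B_z$ for all $x, y, z$, so that (73) implies

\[ \sum_{x,y,z} \langle\psi| A^y_{x,z} \otimes B_{x,y} B_z |\psi\rangle \geq 1 - \delta . \]

Define $\hat{A}_z = \sum_{x,y} A^y_{x,z} \otimes B_{x,y}$ and $\hat{B}_z = I_A \otimes B_z$. The above equation simplifies to

\[ \sum_z \langle\psi| \hat{A}_z \hat{B}_z |\psi\rangle \geq 1 - \delta . \]

This implies that $\hat{A}_z \approx_\delta \hat{B}_z$, as

\[ \sum_z \big\| (\hat{A}_z - \hat{B}_z) |\psi\rangle \big\|^2 = \sum_z \langle\psi| \hat{A}_z^2 + \hat{B}_z^2 |\psi\rangle - 2 \langle\psi| \hat{A}_z \hat{B}_z |\psi\rangle \leq 2\delta , \]

where the equality uses the fact that $\hat{A}_z$ and $\hat{B}_z$ commute. To conclude the proof, we have

\[ I_A \otimes B_{x,y,z} = I_A \otimes B_{x,y} B_z \approx_\delta (I_A \otimes B_{x,y}) \sum_{x',y'} A^{y'}_{x',z} \otimes B_{x',y'} = A^y_{x,z} \otimes B_{x,y} , \]

where the approximation follows from $\hat{A}_z \approx_\delta \hat{B}_z$ and Fact 5.19. □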


8.4.3 Soundness proof 


We analyze the soundness of the introspective verifier. 


Proof of the soundness part of Theorem 8.3. Let $V = (S,D)$ be a normal form verifier such that $S$ is an
$\ell$-level sampler. Recall the definition of the introspective verifier $V^{\mathrm{INTRO}} = (S^{\mathrm{INTRO}}, D^{\mathrm{INTRO}})$ corresponding
to $V$ from Section 8.2. Fix an index $n > 1$ and let $N = 2^n$. Let $R = N^\lambda$ and introparams($R$) $= (q,m,d)$,
as in Section 8.2.

As in the proof of the completeness part of Theorem 8.3, we make the following simplifications. First,
we analyze the soundness of the typed introspective verifier $\hat{V}^{\mathrm{INTRO}}$; the soundness of the detyped verifier
$V^{\mathrm{INTRO}} = \mathrm{Detype}(\hat{V}^{\mathrm{INTRO}})$ follows from Lemma 6.18 and the fact that the type set $\mathcal{T}^{\mathrm{INTRO}}$ has size $O(\ell)$.
Second, analogously to Remark 8.6, without loss of generality and for notational simplicity we consider strategies
for questions sampled from the introspective sampler before downsizing rather than from the downsized sampler.

Suppose that $\mathrm{val}^*(\hat{V}^{\mathrm{INTRO}}_n) > 1 - \varepsilon$ for some $0 < \varepsilon < 1$, and let $\mathscr{S} = (|\psi\rangle, A, B)$ be a strategy for
$\hat{V}^{\mathrm{INTRO}}_n$ with value at least $1 - \varepsilon$. Since $\mathscr{S}$ has success probability that is strictly positive, the decider $D^{\mathrm{INTRO}}$
does not automatically reject, which means that

\[ s(N) \leq N^\lambda . \tag{74} \]

We analyze each of the tests performed by $D^{\mathrm{INTRO}}$ (see Figure 10) in sequence, and state consequences of
each test. We start with Item 1, the Pauli test.


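Lemma 8.19. There exists a function $\delta_1(\varepsilon, R) = a\big( (\log R)^a \varepsilon^b + (\log R)^{-b} \big)$ for universal constants $a > 1$
and $0 < b < 1$ such that the following holds. Let $Q = M \log q$, where $M = 2^m$. There is a projective
strategy $\mathscr{S}' = (|\psi'\rangle, A', B')$ for $\hat{V}^{\mathrm{INTRO}}_n$ that succeeds with probability at least $1 - \delta_1$, and furthermore

\[ |\psi'\rangle_{AB} = |\mathrm{EPR}_2\rangle^{\otimes Q}_{A'B'} \otimes |\mathrm{AUX}\rangle_{A''B''} \tag{75} \]

for some bipartite state $|\mathrm{AUX}\rangle$, and for all $W \in \{X,Z\}$,

\[ A^{\mathrm{PAULI},W}_x = \sigma^W_x , \qquad B^{\mathrm{PAULI},W}_x = \sigma^W_x , \tag{76} \]

where $\sigma^W$ acts on the first $s(N)$ qubits of player A's share (resp. B's share) of $|\mathrm{EPR}_2\rangle^{\otimes Q}$.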


Proof. Given the definition of the type graph $G^{\mathrm{INTRO}}$, for $((\hat{t}_A, \hat{x}_A), (\hat{t}_B, \hat{x}_B))$ sampled according to $\mu_{\hat{S}^{\mathrm{INTRO}}, n}$
it holds that $(\hat{t}_A, \hat{t}_B) \in \mathcal{T}^{\mathrm{PAULI}} \times \mathcal{T}^{\mathrm{PAULI}}$ with constant probability. Therefore, conditioned on the Pauli test,
Item 1, being executed, $\mathscr{S}$ must succeed in the test with probability $1 - O(\varepsilon)$.

Observe that conditioned on the test being executed, the distribution of $((\hat{t}_A, \hat{x}_A), (\hat{t}_B, \hat{x}_B))$ is, by
definition, exactly the distribution of questions in the Pauli basis game with parameters qldparams, as described
in Section 7.3.1. By Corollary 7.15 it follows that there exists a local isometry $\phi = \phi_A \otimes \phi_B$ and a state
$|\mathrm{AUX}\rangle \in \mathcal{H}_{A''} \otimes \mathcal{H}_{B''}$ such that

\[ \big\| \phi(|\psi\rangle) - |\mathrm{EPR}_2\rangle^{\otimes Q} \otimes |\mathrm{AUX}\rangle \big\|^2 \leq \delta'(\varepsilon, R) , \tag{77} \]

where $\delta'(\varepsilon, R)$ is an upper bound on $\delta_{\mathrm{QLD}}(O(\varepsilon), q, m, d)$ that only depends on $\varepsilon$ and $R$, as stated in Lemma 7.17.
In addition, defining $\hat{A}^x_a = \phi_A A^x_a \phi_A^\dagger$ for all questions $x$ and answers $a$, for $W \in \{X, Z\}$ it holds that

\[ \hat{A}^{\mathrm{PAULI},W}_x \otimes I_B \approx_{\delta'(\varepsilon,R)} \sigma^W_x \otimes I_B , \tag{78} \]

and a similar set of equations hold for operators associated with the second player. Using Naimark's theorem
as formulated in [JNV+20, Theorem 5.1], at the cost of extending the state $|\mathrm{AUX}\rangle$ we may assume that
the measurements are projective without loss of generality. Define the strategy $\mathscr{S}'$ which uses the state
$|\mathrm{EPR}_2\rangle^{\otimes Q} \otimes |\mathrm{AUX}\rangle$ and measurement operators $\{\hat{A}^x_a\}$ and $\{\hat{B}^x_a\}$ for all questions $x$, except for (PAULI, $W$)-
type questions where instead the Pauli measurements $\sigma^W$ are used. Using (77) and (78) the strategy $\mathscr{S}'$
succeeds in $\hat{V}^{\mathrm{INTRO}}_n$ with probability at least $1 - \delta'(\varepsilon, R)$.

The claimed bound on $\delta_1$ follows from the bound given in Lemma 7.17. □
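To get a feel for the shape of the error function in Lemma 8.19, the sketch below evaluates $a((\log R)^a \varepsilon^b + (\log R)^{-b})$ for placeholder constants. The values `a = 2.0` and `b = 0.5` and the base-2 logarithm are assumptions chosen only for illustration; the paper asserts only that suitable universal constants exist.

```python
import math


def delta_bound(eps: float, R: float, a: float = 2.0, b: float = 0.5) -> float:
    """Evaluate a * ((log R)^a * eps^b + (log R)^(-b)).

    The constants a and b are hypothetical placeholders, and the
    logarithm is taken base 2 as an assumption.
    """
    log_r = math.log2(R)
    return a * (log_r ** a * eps ** b + log_r ** (-b))


R = 2 ** 16
floor_value = delta_bound(0.0, R)      # contribution of the (log R)^(-b) term alone
small_eps_value = delta_bound(1e-12, R)
print(floor_value, small_eps_value)
```

With these placeholder constants, the second term keeps the bound away from zero for fixed $R$ even when $\varepsilon = 0$, so the bound becomes small only when $R$ is large and $\varepsilon$ is small compared with an inverse power of $\log R$.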


In the remainder of the proof we analyze the strategy $\mathscr{S}'$ from Lemma 8.19. We use the following
notation conventions:

1. We use indices A and B to label each player's Hilbert space after application of the isometry $\phi$ from
Lemma 8.19.

2. We write $V$ for the register subspace of $\mathbb{F}_2^Q$ spanned by $e_1, \ldots, e_{s(N)}$ and $\overline{V}$ for its complement. (Note
that by definition of introparams in Section 7.3.2 it holds that $s(N) \leq R \leq Q$, where the first
inequality follows from (74).)

3. Whenever we write a Pauli operator $\sigma^W_a$ the register on which the operator acts should always be clear
from context, and is implicit from the space in which the outcome $a$ ranges.

4. For measurement operators in the introspection game, the variables for the measurement outcomes
follow the specification of the "answer key" in Figure 8 (for $\mathcal{T}^{\mathrm{PAULI}}$-type questions) and Figure 10
(for all other question types). For example, the measurement operators $\{A^{\mathrm{PAULI},W}_x\}$ corresponding
to question type (PAULI, $W$) are indexed by vectors $x \in \mathbb{F}_2^Q$ where $Q = M \log q$ and $M = 2^m$.
The measurement operators corresponding to question type (INTRO, $v$) for $v \in \{A,B\}$ are indexed
by pairs $(y,a) \in V \times \{0,1\}^{\leq 3Q}$ (see footnote 34). We often refer to marginalized measurement operators, e.g., the
operator $A^{\mathrm{INTRO},v}_y$ denotes marginalizing $A^{\mathrm{INTRO},v}_{y,a}$ over all $a$. In these cases, the part of the answer that
is marginalized over will be clear from context.

5. We use the notation $\delta$ to denote a function which is polynomial in $\delta_1$, although the exact expression
may differ from occurrence to occurrence. The polynomial itself may depend on $\ell$, but we leave this
dependence implicit; due to the use of inductive steps that involve taking the square root of the error
$\ell$ times in sequence (e.g. Lemma 8.24) the exponent generally depends on $\ell$.

34 Technically the answer $a$ may be a binary string of any length; however, if $a$ is too long the decider rejects due to the answer
length check. Thus we assume without loss of generality that the answer $a$ is a binary string of length at most $3Q = 3 \cdot 2^m \cdot \log q$.
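The remark in convention 5 about repeated square roots can be made concrete: if each inductive step replaces an error $\delta$ by $\sqrt{\delta}$, then after $\ell$ steps the error is $\delta^{2^{-\ell}}$. The sketch below ignores the constant factors that each step also introduces, which is a simplifying assumption, and the function name is hypothetical.

```python
import math


def error_after_rounds(delta: float, rounds: int) -> float:
    """Apply the square root `rounds` times, ignoring constant factors."""
    err = delta
    for _ in range(rounds):
        err = math.sqrt(err)
    return err


initial_error = 1e-16
final_error = error_after_rounds(initial_error, 4)
closed_form = initial_error ** (1 / 2 ** 4)
print(final_error, closed_form)
```

The exponent $2^{-\ell}$ shows why the polynomial relating $\delta$ to $\delta_1$ depends on the number of levels $\ell$: an initial error must be exponentially small in $2^{\ell}$ for the final error to stay small.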


The next two lemmas derive conditions implied by Items 2 and 3 of the checks performed by the decider
$D^{\mathrm{INTRO}}$ described in Figure 10. As these two parts are performed independently for the two possible values
of $v \in \{A,B\}$, we only discuss the case where $v = A$. For notational simplicity, whenever possible we omit
$v$ when referring to the measurement operators. For example $A^{\mathrm{INTRO}}_{y,a}$ and $B^{\mathrm{SAMPLE}}_{z,a}$ are used as shorthand
notation for $A^{\mathrm{INTRO},A}_{y,a}$ and $B^{\mathrm{SAMPLE},A}_{z,a}$ respectively.


Lemma 8.20 (Sampling test, Item 2 of Figure 10). For each $k \in \{1,2,\ldots,\ell\}$,

\begin{align*}
I_A \otimes B^{\mathrm{SAMPLE}}_z &\approx_\delta \sigma^Z_z \otimes I_B , \tag{79} \\
A^{\mathrm{INTRO}}_{y_{\leq k}} \otimes I_B &\approx_\delta I_A \otimes B^{\mathrm{SAMPLE}}_{[L_{\leq k}(\cdot) = y_{\leq k}]} , \tag{80}
\end{align*}

where $z$ ranges over $V$ and $y_{\leq k}$ ranges over $L_{\leq k}(V)$. Moreover, analogous equations hold with operators
acting on the other side of the tensor product.

Proof. When $((\hat{t}_A, \hat{x}_A), (\hat{t}_B, \hat{x}_B))$ is sampled according to $\mu_{\hat{S}^{\mathrm{INTRO}}, n}$, each check in Item 2 of Figure 10 is
executed with probability $\Omega(1/\ell)$ (this is due to the number of types in $\mathcal{T}^{\mathrm{INTRO}}$ and the structure of the type
graph $G^{\mathrm{INTRO}}$). Therefore, in each of the checks specified by Items 2a and 2b, the strategy $\mathscr{S}'$ succeeds
with probability at least $1 - O(\ell\delta)$, conditioned on the test being executed. Item 2a for $w = A$ combined
with (76) implies (79). Item 2b for $w = A$, combined with Fact 5.24, implies (80). The lemma follows from
repeating the same argument with the tensor factors interchanged. □


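Lemma 8.21 (Hiding test, Item 3 of Figure 10). For each $k \in \{1,\ldots,\ell+1\}$,

\[ A^{\mathrm{INTRO}}_{y_{<k}} \otimes I_B \approx_\delta I_A \otimes B^{\mathrm{READ}}_{y_{<k}} , \tag{81} \]

and if $k \leq \ell$,

\[ I_A \otimes B^{\mathrm{HIDE}_k}_{y_{<k}} \approx_\delta \sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes I_B . \tag{82} \]

Furthermore, for all $j,k \in \{1,\ldots,\ell-1\}$ such that $j \leq k$, we have

\[ A^{\mathrm{HIDE}_k}_{y_{<j},\, y^\perp_j} \otimes I_B \approx_\delta A^{\mathrm{HIDE}_{k+1}}_{y_{<j},\, y^\perp_j} \otimes I_B . \tag{83} \]

Analogous equations to (81), (82), and (83) hold with operators acting on the other side of the tensor
product.

Proof. When $((\hat{t}_A, \hat{x}_A), (\hat{t}_B, \hat{x}_B))$ is sampled according to $\mu_{\hat{S}^{\mathrm{INTRO}}, n}$, each check in Item 3 of Figure 10 is
executed with probability $\Omega(1/\ell)$. Therefore, in each of the checks specified by Items 3a, 3b, and 3c
(conditioned on the right types) the strategy $\mathscr{S}'$ succeeds with probability at least $1 - O(\ell\delta)$.
Item 3a for $w = A$ combined with Fact 5.24 implies (81). Combining Eqs. (79) and (80) with (81)
yields

\[ I_A \otimes B^{\mathrm{READ}}_{y_{<k}} \approx_\delta \sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes I_B . \tag{84} \]

The fact that $\mathscr{S}'$ is projective and succeeds in Items 3b and 3c of Figure 10 with probability $1 - O(\ell\delta)$,
along with Item 1 of Fact 5.18, imply (82).

We now establish the "Furthermore" part of the lemma statement. Let $1 \leq j \leq k \leq \ell - 1$. Item 3c,
Item 1 of Fact 5.18, and Fact 5.24 imply that

\[ A^{\mathrm{HIDE}_k}_{y_{<j},\, y^\perp_j} \otimes I_B \approx_\delta I_A \otimes B^{\mathrm{HIDE}_{k+1}}_{y_{<j},\, y^\perp_j} . \tag{85} \]

Item 5 and Fact 5.24 imply

\[ A^{\mathrm{HIDE}_{k+1}}_{y_{<j},\, y^\perp_j} \otimes I_B \approx_\delta I_A \otimes B^{\mathrm{HIDE}_{k+1}}_{y_{<j},\, y^\perp_j} . \tag{86} \]

This proves (83).
The lemma follows from repeating the same arguments with the tensor factors interchanged. □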


We exploit the tests performed in Item 3 further to show the following lemma. 


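Lemma 8.22. For all $k \in \{1,2,\ldots,\ell\}$,

\[ I_A \otimes B^{\mathrm{HIDE}_k}_{y_{<k},\, y^\perp_k,\, x_{>k}} \approx_\delta \Big( \sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes \sigma^X_{[L^\perp_{k,y_{<k}}(\cdot) = y^\perp_k]} \otimes \sigma^X_{x_{>k}} \Big) \otimes I_B , \]

and an analogous equation holds with operators acting on the other side of the tensor product.

Proof. The proof is by induction on $k$. We first show the case $k = 1$. Under the distribution $\mu_{\hat{S}^{\mathrm{INTRO}}, n}$, the
check in Item 3d of Fig. 10 is executed with probability $\Omega(1/\ell)$. That part of the check for $w = A$ together
with Eq. (76) implies that

\[ I_A \otimes B^{\mathrm{HIDE}_1}_{y^\perp_1,\, x_{>1}} \approx_\delta \Big( \sigma^X_{[L^\perp_1(\cdot) = y^\perp_1]} \otimes \sigma^X_{x_{>1}} \Big) \otimes I_B . \tag{87} \]

This proves the case for $k = 1$. Next we perform the induction step. Assume that the lemma holds for some
$k \in \{1,2,\ldots,\ell-1\}$. The check of Item 3c is executed with probability $\Omega(1/\ell)$; using that $\mathscr{S}'$ succeeds in
Item 3c (conditioned on the right types having been sampled) with probability at least $1 - O(\ell\delta)$, we have

\[ \sum_{y_{<k},\, y_k,\, y^\perp_{k+1},\, x_{>k+1}} \langle\psi| A^{\mathrm{HIDE}_k}_{y_{<k},\, [L^\perp_{k+1,y_{\leq k}}(\cdot) = y^\perp_{k+1}],\, x_{>k+1}} \otimes B^{\mathrm{HIDE}_{k+1}}_{y_{\leq k},\, y^\perp_{k+1},\, x_{>k+1}} |\psi\rangle \geq 1 - O(\ell\delta) . \]

We now apply Lemma 8.18, choosing the measurements $A$, $B$ and outcomes $x, y, z$ in the lemma as
follows:

"$A$": $A^{\mathrm{HIDE}_k}_{y_{<k},\, [L^\perp_{k+1,y_{\leq k}}(\cdot) = y^\perp_{k+1}],\, x_{>k+1}}$ , "$B$": $B^{\mathrm{HIDE}_{k+1}}_{y_{\leq k},\, y^\perp_{k+1},\, x_{>k+1}}$ ,
"$x$": $y_{<k}$ , "$y$": $y_k$ , "$z$": $(y^\perp_{k+1}, x_{>k+1})$ .

Note that here $A$ does not depend on $y$, so we use the same $A$ for all values of $y$. Lemma 8.18 with the
above choices of parameters implies that

\begin{align*}
I_A \otimes B^{\mathrm{HIDE}_{k+1}}_{y_{\leq k},\, y^\perp_{k+1},\, x_{>k+1}}
&\approx_\delta A^{\mathrm{HIDE}_k}_{y_{<k},\, [L^\perp_{k+1,y_{\leq k}}(\cdot) = y^\perp_{k+1}],\, x_{>k+1}} \otimes B^{\mathrm{HIDE}_{k+1}}_{y_{\leq k}} \\
&\approx_\delta \Big( \sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes \sigma^X_{[L^\perp_{k+1,y_{\leq k}}(\cdot) = y^\perp_{k+1}]} \otimes \sigma^X_{x_{>k+1}} \Big) \otimes B^{\mathrm{HIDE}_{k+1}}_{y_{\leq k}} \\
&\approx_\delta \Big( \sigma^Z_{[L_{<k}(\cdot) = y_{<k}]}\, \sigma^Z_{[L_{\leq k}(\cdot) = y_{\leq k}]} \otimes \sigma^X_{[L^\perp_{k+1,y_{\leq k}}(\cdot) = y^\perp_{k+1}]} \otimes \sigma^X_{x_{>k+1}} \Big) \otimes I_B \\
&\approx_\delta \Big( \sigma^Z_{[L_{\leq k}(\cdot) = y_{\leq k}]} \otimes \sigma^X_{[L^\perp_{k+1,y_{\leq k}}(\cdot) = y^\perp_{k+1}]} \otimes \sigma^X_{x_{>k+1}} \Big) \otimes I_B ,
\end{align*}

where the input to $L^\perp_{k+1,y_{\leq k}}(\cdot)$ is $x_{k+1}$. The second approximation uses the induction hypothesis, Fact 5.24,
Fact 5.18, and Fact 5.19. The third approximation follows from Eq. (82) and Fact 5.19. The fourth
approximation follows from the definition of CL functions. This completes the induction. □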


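Lemma 8.23. For all $k \in \{1,\ldots,\ell\}$,

\[ I_A \otimes B^{\mathrm{READ}}_{y_{<k},\, y^\perp_k} \approx_\delta \Big( \sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes \sigma^X_{[L^\perp_{k,y_{<k}}(\cdot) = y^\perp_k]} \Big) \otimes I_B . \tag{88} \]

Moreover, analogous equations hold with operators acting on the other side of the tensor product.

Proof. Lemma 8.22 and Fact 5.24 imply that

\[ A^{\mathrm{HIDE}_k}_{y_{<k},\, y^\perp_k} \otimes I_B \approx_\delta \Big( \sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes \sigma^X_{[L^\perp_{k,y_{<k}}(\cdot) = y^\perp_k]} \Big) \otimes I_B . \tag{89} \]

Since the strategy $\mathscr{S}'$ succeeds in Item 3b with probability at least $1 - O(\ell\delta)$ it follows from Fact 5.24 that

\[ A^{\mathrm{HIDE}_\ell}_{y_{<k},\, y^\perp_k} \otimes I_B \approx_\delta I_A \otimes B^{\mathrm{READ}}_{y_{<k},\, y^\perp_k} . \tag{90} \]

An inductive argument applied to (83) of Lemma 8.21, combined with Fact 5.24, implies that for all
$1 \leq k \leq \ell$ we have

\[ A^{\mathrm{HIDE}_k}_{y_{<k},\, y^\perp_k} \otimes I_B \approx_\delta A^{\mathrm{HIDE}_\ell}_{y_{<k},\, y^\perp_k} \otimes I_B . \tag{91} \]

Equations (89), (90), and (91), combined with Fact 5.24, then establish the lemma statement. □
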
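Lemma 8.24. For each $k \in \{1,2,\ldots,\ell+1\}$, there exists a product state $|\mathrm{ANC}_k\rangle = |\mathrm{ANC}_{k,A}\rangle \otimes |\mathrm{ANC}_{k,B}\rangle \in
\mathcal{H}_{A'''} \otimes \mathcal{H}_{B'''}$ and, for each $y_{<k} \in L_{<k}(V)$, a projective measurement $\{A^{\mathrm{INTRO},y_{<k}}_{y_{\geq k},a}\}$ that acts on $\mathcal{H}_A \otimes \mathcal{H}_{A'''}$
such that the following holds. First, for all $y \in V$, the operator $A^{\mathrm{INTRO},y_{<k}}_{y_{\geq k},a}$ acts as identity on the register
subspace spanned by basis vectors for the subspace $V_{<k}(y_{<k})$, and as a consequence the operator

\[ A^{\mathrm{INTRO},Z_{<k}}_{y,a} = \sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes A^{\mathrm{INTRO},y_{<k}}_{y_{\geq k},a} \tag{92} \]

is well-defined. Second, let $\mathscr{S}''_k$ be the strategy defined as follows. The state is $|\mathrm{EPR}_2\rangle^{\otimes Q} \otimes |\mathrm{AUX}\rangle \otimes
|\mathrm{ANC}_k\rangle$. The measurements are identical to those in $\mathscr{S}'$ defined in Lemma 8.19, except that $\{A^{\mathrm{INTRO}}_{y,a}\}$ is
replaced with $\{A^{\mathrm{INTRO},Z_{<k}}_{y,a}\}$. Then $\mathscr{S}''_k$ succeeds with probability at least $1 - \delta$ in the game $\hat{V}^{\mathrm{INTRO}}_n$.

Proof. The proof is by induction on $k$ from 1 to $\ell + 1$. The case $k = 1$ is trivial by setting $A^{\mathrm{INTRO},y_{<1}}_{y,a} =
A^{\mathrm{INTRO}}_{y,a}$ for all $y, a$. Assume that for some $k \in \{1,2,\ldots,\ell\}$ there exist projective measurements $\{A^{\mathrm{INTRO},y_{<k}}_{y_{\geq k},a}\}$
for every $y_{<k}$ and a strategy $\mathscr{S}''_k$ satisfying the conditions of the lemma statement. We show the statement
of the lemma for $k + 1$.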


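Commutation with Z-basis measurements. We first prove that on average over $y_{<k}$, the measurement
operator $A^{\mathrm{INTRO},y_{<k}}_{y_{\geq k},a}$, which comes from the inductive assumption, commutes with the projective
measurement $\{\sigma^Z_{z_k}\}$ where the outcomes $z_k$ range over the factor space $V_k(y_{<k})$.

To do so, we first apply Lemma 5.25 where we choose the measurements "$A$", "$B$", and "$C$" and
outcomes "$a$", "$b$", and "$c$" in the lemma as follows:

"$A$": $\{A^{\mathrm{INTRO},Z_{<k}}_{y,a}\}$ , "$B$": $\{B^{\mathrm{SAMPLE}}_{z,a}\}$ , "$C$": $\{\sigma^Z_z\}$ ,
"$a$": $y_{<k}$ , "$b$": $(y_{\geq k}, a)$ , "$c$": $z$ .

To make sense of how the "$B$" POVM is indexed by "$a$", "$b$", and "$c$" as described above, we use the
following relabelling: for all $(z,a)$, we identify $B^{\mathrm{SAMPLE}}_{z,a}$ with $B^{\mathrm{SAMPLE}}_{y,a,z}$ where $y = L(z)$. Similarly, for the "$C$"
POVM, we identify $\sigma^Z_z$ with the operator $\sigma^Z_{y_{<k},z}$ where $y_{<k} = L_{<k}(z)$. By applying Lemma 8.20 to $\mathscr{S}''_k$ (the
strategy given by the inductive hypothesis) with "$k$" in Lemma 8.20 set to $\ell$, we have that

\[ A^{\mathrm{INTRO},Z_{<k}}_{y} \otimes I_B \approx_\delta I_A \otimes B^{\mathrm{SAMPLE}}_{[L(\cdot) = y]} , \tag{93} \]

where $B^{\mathrm{SAMPLE}}_{[L(\cdot) = y]} = \sum_{z : L(z) = y} B^{\mathrm{SAMPLE}}_z$. Equations (79) and (93) imply that the conditions of Lemma 5.25 are
satisfied, and thus we obtain

\[ \big[ A^{\mathrm{INTRO},Z_{<k}}_{y,a} ,\, \sigma^Z_z \big] \otimes I_B \approx_\delta 0 , \tag{94} \]

where in the answer summation, $y$ is a deterministic function of $z$. We now apply Lemma 8.17, choosing
the measurements "$A$", "$B$", "$Q$" and outcomes "$a$", "$b$", "$c$" in the lemma as follows:

"$A$": $\{A^{\mathrm{INTRO},Z_{<k}}_{y,a}\}$ , "$B$": $\{\sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes \sigma^Z_{z_k}\}$ , "$Q$": $\{\sigma^Z_{[L_{<k}(\cdot) = y_{<k}]}\}$ ,
"$a$": $y_{<k}$ , "$b$": $(y_{\geq k}, a)$ , "$c$": $z_k$ .

Here, we write $\sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes \sigma^Z_{z_k}$ to denote

\[ \sum_{\substack{z' \in V :\ L_{<k}(z') = y_{<k} , \\ (z')^{V_k(y_{<k})} = z_k}} \sigma^Z_{z'} . \]

We choose the Hilbert spaces "$\mathcal{H}_{A,a}$" and "$\mathcal{H}_{B,a}$" as the register subspace $V_{<k}(y_{<k})$, and "$\mathcal{H}_{A',a}$" and
"$\mathcal{H}_{B',a}$" as the register subspace $V_{\geq k}(y_{<k})$ tensored with $\mathcal{H}_{A''} \otimes \mathcal{H}_{B''}$, the Hilbert space of the state $|\mathrm{AUX}\rangle$.
Thus for every $y_{<k}$, the state of the strategy $\mathscr{S}''_k$ can be decomposed into a tensor product
of a "question state" and an "answer state" as follows:

\[ \Big( |\mathrm{EPR}_2\rangle_{V_{<k}(y_{<k})} \Big)_{\mathcal{H}_{A,a} \mathcal{H}_{B,a}} \otimes \Big( |\mathrm{EPR}_2\rangle_{V_{\geq k}(y_{<k})} \otimes |\mathrm{AUX}\rangle \Big)_{\mathcal{H}_{A',a} \mathcal{H}_{B',a}} . \]

Let $\mu_{L_{<k}}$ denote the distribution over outcomes $y_{<k}$ generated by performing the "$Q$" measurement on
the state $|\mathrm{EPR}_2\rangle^{\otimes Q}$, which is equivalent to the distribution generated by the following procedure: (1) sample
a uniformly random $z \in V$; (2) compute $y = L(z)$; (3) return $y_{<k}$. Then, since Equation (94) and the
inductive hypothesis about the structure of $A^{\mathrm{INTRO},Z_{<k}}_{y,a}$ satisfy the conditions of Lemma 8.17, we obtain on
average over $y_{<k} \sim \mu_{L_{<k}}$,

\[ \big[ A^{\mathrm{INTRO},y_{<k}}_{y_{\geq k},a} ,\, \sigma^Z_{z_k} \big] \otimes I_B \approx_\delta 0 . \tag{95} \]

Here, the measurement outcomes $z_k$ range over $V_k(y_{<k})$.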
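The three-step description of the distribution over prefixes $y_{<k}$ (sample a uniform $z$, compute $y = L(z)$, return $y_{<k}$) can be sketched on a toy example. The two-level map `toy_cl_function` below is a hypothetical stand-in over $\mathbb{F}_2^2$ with fixed registers; it is not one of the paper's samplers, and the exact enumeration replaces random sampling so that the result is deterministic.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def toy_cl_function(z):
    """Hypothetical 2-level map on F_2^2.

    Level 1 applies the identity to the first coordinate; level 2 applies
    a linear map to the second coordinate that depends on the level-1 output.
    """
    z1, z2 = z
    y1 = z1
    y2 = z2 if y1 == 0 else 0  # identity map if y1 = 0, zero map if y1 = 1
    return (y1, y2)


def prefix_distribution(k, dim=2):
    """Exact distribution of y_{<k} when z is uniform over F_2^dim."""
    counts = Counter(toy_cl_function(z)[:k - 1] for z in product((0, 1), repeat=dim))
    total = 2 ** dim
    return {prefix: Fraction(c, total) for prefix, c in counts.items()}


level_one = prefix_distribution(2)  # distribution of y_{<2} = (y_1,)
full = prefix_distribution(3)       # distribution of y_{<3} = (y_1, y_2)
print(level_one)
print(full)
```

In this toy example the prefix $y_{<2}$ is uniform, while the full output is not, since the second-level map collapses its input whenever $y_1 = 1$.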


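Commutation with X-basis measurements. Next, we prove that on average over $y_{<k}$, the measurement
operator $A^{\mathrm{INTRO},y_{<k}}_{y_{\geq k},a}$ commutes with the projective measurement $\{\sigma^X_{[L^\perp_{k,y_{<k}}(\cdot) = y^\perp_k]}\}$ where the outcomes
$y^\perp_k$ are elements of the factor space $V_k(y_{<k})$.

We again apply Lemma 5.25, choosing the measurements and outcomes in the lemma as follows:

"$A$": $\{A^{\mathrm{INTRO},Z_{<k}}_{y,a}\}$ , "$B$": $\{B^{\mathrm{READ}}_{y,\, y^\perp_k}\}$ , "$C$": $\{\sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes \sigma^X_{[L^\perp_{k,y_{<k}}(\cdot) = y^\perp_k]}\}$ ,
"$a$": $y_{<k}$ , "$b$": $(y_{\geq k}, a)$ , "$c$": $y^\perp_k$ .

Equations (81) (with "$k$" in Lemma 8.21 chosen to be $\ell + 1$) and (88) imply the conditions of Lemma 5.25,
so we obtain

\[ \big[ A^{\mathrm{INTRO},Z_{<k}}_{y,a} ,\, \sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes \sigma^X_{[L^\perp_{k,y_{<k}}(\cdot) = y^\perp_k]} \big] \otimes I_B \approx_\delta 0 . \tag{96} \]

We then apply Lemma 8.17 with the following choice of measurements and outcomes:

"$A$": $\{A^{\mathrm{INTRO},Z_{<k}}_{y,a}\}$ , "$B$": $\{\sigma^Z_{[L_{<k}(\cdot) = y_{<k}]} \otimes \sigma^X_{[L^\perp_{k,y_{<k}}(\cdot) = y^\perp_k]}\}$ , "$Q$": $\{\sigma^Z_{[L_{<k}(\cdot) = y_{<k}]}\}$ ,
"$a$": $y_{<k}$ , "$b$": $(y_{\geq k}, a)$ , "$c$": $y^\perp_k$ .

The Hilbert space and state decomposition are the same as in the previous invocation of Lemma 8.17.
Equation (96) and the inductive hypothesis satisfy the conditions of Lemma 8.17, and we similarly obtain
that on average over $y_{<k} \sim \mu_{L_{<k}}$,

\[ \big[ A^{\mathrm{INTRO},y_{<k}}_{y_{\geq k},a} ,\, \sigma^X_{[L^\perp_{k,y_{<k}}(\cdot) = y^\perp_k]} \big] \otimes I_B \approx_\delta 0 . \tag{97} \]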

Applying the Pauli twirl. The last step is to apply the Pauli twirl to decompose the family of measure- 
ments E seh } into a tensor product measurement, with the first part of the tensor product measuring 
the k-th linear map of L. 


Again applying Lemma 8.20 to .~/’, and using Facts 5.24 and 5.18, we obtain that 


An 8 Ip &5 Oe 8 Ip 


which is equivalent to, by the inductive hypothesis, 


Thea)=ya| @An 7E O Te Xs oF @ Trygt) © B: (98) 


(-)=yeal [Lek(-)=y<xl 


Applying Lemma 8.16 to Equation (98), we conclude that 


INTRO, y< ~~. AZ 
Ay @ lp So oh, (8B, ™) 


on average Over Yer ~ UL;- 
Now we apply Lemma 8.15 with the following identification: 


66.99 664 99 


eye, “Wik, “APs (Yona), “Le”: Lenya 
« >. c6 2., INTRO, y k éc X, Ys , INTRO, Y<k 
Ux : Vi(Yek) , My, Aypa < , Mi : Ay : 


>k A 


The “Consistency” condition is implied by Equation (99) and the “Commutation” conditions are implied by 
Equations (95) and (97). We obtain that for all y<; there exists POVM measurements {Ana YSK that 
act as identity on the register subspace spanned by basis vectors for the subspace V-;,(y<,) and, on average 


over Yer ~ UL p» We have 


INTRO, y PR Z INTRO, y< 
Ays,,a 1O Ip ~ (haO 8 Aysa if) ® Íg. (100) 
Using the inductive assumption on the structure of AY RO, Z< from (92), we get 
INTRO, Z = Z INTRO, y 
Aya @ Ib = (Oh ataya 2 Aeee”) ® Ie (101) 
©% Z INTRO, Y<k 
6 (katya 8 Avoid) @ (102) 




where the second line follows from (100) and Lemma 8.16. By (102), replacing the projective measurement 
{Ayn <1 with the POVM 


INTRO, Y<k 
: " : n sae INTRO, Y<k 
in the strategy .," results in a strategy that succeeds with probability at least 1 — ô. To show that { Ay, ca 
can furthermore be turned into a projective measurement we use Naimark's theorem as formulated in [JNV+20,
Theorem 5.1]. In particular, we see that the state $|\mathrm{ANC}'\rangle$ it produces is a product state with no entanglement.
This yields a strategy $\mathscr{S}''_{k+1}$ with ancilla state $|\mathrm{ANC}_{k+1}\rangle = |\mathrm{ANC}_k\rangle |\mathrm{ANC}'\rangle$, which establishes the induction
hypothesis for $k+1$. This completes the proof of Lemma 8.24. □


Taking k = £ + 1 in Lemma 8.24 we obtain a strategy 7” = .7;'_, with value 1 — ô in which, given 
question (INTRO, A), player A performs the measurement 


(ofian 8 Aa), (103) 


for a family of measurements E } acting on the state |EPR)7 ® |AUX) @ |ANC) where |ANC) = 
|ANCy+1) is unentangled. An analogous argument for Player B’s measurements shows that we may addi- 
tionally assume Player B responds to the question (INTRO, B) using the measurement 


(oiee 8 Be} , (104) 


for a family of measurements m Y} acting on the state |EPR)y ® |AUX) ® |ANC’), where |ANC’) is 
unentangled. Summarizing, the strategy .7” uses the state 


|0) = |EPR)22 @ |aux) @ |ANC) @ |ANC’) , 


and the measurements given by Equations (103) and (104). 
To conclude the proof of the soundness part of the theorem we analyze Item 4 in Figure 10. The test 
in Item 4 is executed with probability (1/2), so the strategy .Y” succeeds with probability at least 1 — ô 
in that test, conditioned on the right types (here we absorb factors of O(¢) into ô). Using (103) and (104), 
conditioned on the test being executed the distribution of the part (ya, yp) of the players’ answers in the 
test is exactly the distribution 4s, y associated with game Vy. As a result, the strategy which uses the state 
|EPR)7 @ |AUX) @ |ANC) @ |ANC’) and measurements {AY}, {BIY} succeeds with probability 
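at least $1 - \delta$ in the game $\mathcal{V}_N$. Thus, $\mathrm{val}^*(\mathcal{V}_N) \geq 1 - \delta$. It remains to show that $\delta$ has the required form. As
$\delta = \mathrm{poly}(\delta')$, $\delta'(\varepsilon, R) = a'\big((\log R)^{a'} \varepsilon^{b'} + (\log R)^{-b'}\big)$, and $R = (2^n)^{\lambda}$, there is a constant $C \geq 1$ such
that

$$\delta(\varepsilon, n) \leq C \Big( a' \big( (\lambda n)^{a'} \varepsilon^{b'} + (\lambda n)^{-b'} \big) \Big)^{1/C}$$
$$\leq C (a')^{1/C} \big( (\lambda n)^{a'/C} \varepsilon^{b'/C} + (\lambda n)^{-b'/C} \big)$$
$$\leq a \big( (\lambda n)^{a} \varepsilon^{b} + (\lambda n)^{-b} \big),$$

for $a = \max\{C (a')^{1/C}, a'/C\}$, and $b = b'/C$, where the second line uses the inequality $(x + y)^{1/C} \leq
x^{1/C} + y^{1/C}$ for $x, y \geq 0$ and $C \geq 1$.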
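
The elementary inequality used in the second line can be checked directly. The following is a standard argument, included only as a sketch; the shorthands $p = 1/C$ and $t = x/(x+y)$ are introduced for this note and do not appear elsewhere in the proof.

```latex
% Claim: (x + y)^p <= x^p + y^p for x, y >= 0 and p = 1/C in (0, 1].
% If x = y = 0 both sides vanish. Otherwise set t = x/(x+y), which lies in [0, 1].
% For t in [0, 1] and p in (0, 1] we have t <= t^p, and likewise for 1 - t.
\[
  t \le t^{p}, \qquad 1 - t \le (1-t)^{p}
  \quad\Longrightarrow\quad
  1 = t + (1 - t) \le \frac{x^{p} + y^{p}}{(x+y)^{p}} .
\]
```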

This establishes the soundness part of the theorem. 

To show the entanglement bound, we observe that local isometries do not change the Schmidt rank of 
a state. Define |p’) = (|W)) @ |ANC) @ |ANC’). Since the strategy (|p), {ATOY}, {BIT is 




5-close to (|), {ATY}, {BY }) which has value 1 — 4, the strategy (|p), {APY}, {BIT 
has value at least 1 — 26 in the game Vy, and therefore the Schmidt rank of (|y) (and thus of |~)) must 
be at least &(Vy,1— 20). Here, we use the fact that ancilla states are product states and therefore have 
Schmidt rank 1. 

Moreover, recall that $\phi(|\psi\rangle)$ is $\delta$-close to $|\mathrm{EPR}_2\rangle^{\otimes Q} \otimes |\mathrm{AUX}\rangle$, whose Schmidt coefficients are all at
most $2^{-Q/2}$. For any bipartite state $|a\rangle$ with Schmidt rank at most $R$ and $|b\rangle$ whose Schmidt coefficients are
all at most $\beta$, it follows from the Cauchy-Schwarz inequality that $|\langle a|b\rangle|^2 \leq R\beta^2$. Therefore the Schmidt
rank of $\phi(|\psi\rangle)$ (and thus of $|\psi\rangle$) is at least


Gao 2S ia)’, 
where we used that Q = Mlogq > R (using the canonical parameter settings of Definition 7.16) and 


ô > ||P(|p)) — |@)||? = 2 — 2R(O|p(|)). Combining the two lower bounds on the Schmidt rank of |) 
shows the desired lower bound on &(VINTP°, 1 — e). oO 
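
The overlap bound $|\langle a|b\rangle|^2 \leq R\beta^2$ invoked in the entanglement bound can be unpacked as follows. This is a sketch of one standard route (von Neumann's trace inequality followed by Cauchy-Schwarz), not necessarily the argument the authors had in mind; the coefficient matrices $A$, $B$ and the notation $s_i(\cdot)$ for singular values are introduced only for this note.

```latex
% A, B: coefficient matrices of |a>, |b> in a fixed product basis, so that
% <a|b> = Tr(A^\dagger B) and the singular values s_i(.) are the Schmidt coefficients.
% Step 1: von Neumann's trace inequality.
% Step 2: s_i(B) <= beta for all i, and s_i(A) = 0 for i > R.
% Step 3: Cauchy-Schwarz on the R nonzero terms; |a> is normalized.
\[
  |\langle a | b \rangle|
  = |\mathrm{Tr}(A^{\dagger} B)|
  \le \sum_{i} s_i(A)\, s_i(B)
  \le \beta \sum_{i=1}^{R} s_i(A)
  \le \beta \sqrt{R}\, \Big( \sum_{i=1}^{R} s_i(A)^2 \Big)^{1/2}
  = \beta \sqrt{R} .
\]
```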




9 Oracularization 


9.1 Overview 


In this section we introduce the oracularization transformation. At a high level, the oracularization $\mathfrak{G}^{\mathrm{ORAC}}$
of a nonlocal game $\mathfrak{G}$ is intended to implement the following: one player (called the oracle player) is
supposed to receive questions $(x,y)$ meant for both players in the original game $\mathfrak{G}$, and the other player
(called the isolated player) only receives either $x$ or $y$ (but not both), along with a label indicating which
player in the original game the question is associated with (we refer to such players as the original players,
e.g. "original A player" and "original B player"). The oracle player is supposed to respond with an answer
pair $(a,b)$, and the other player is supposed to respond with an answer $c$. The oracle and isolated player win
the oracularized game if $(x,y,a,b)$ satisfies the predicate of the original game $\mathfrak{G}$ and the isolated player's
answer is consistent with the oracle player's answer.

The oracularization step is needed in preparation for the next section, in which we perform answer
reduction on the introspection game. To implement answer reduction we need at least one player to be able
to compute a proof, in the form of a PCP, that the decider of the original game would have accepted the 
questions (x,y) and answers (a,b). This requires the player to have access to both questions, and be able to 
compute both answers. 


9.2 Oracularizing normal form verifiers 


Let $\mathcal{V} = (\mathcal{S}, \mathcal{D})$ be a normal form verifier such that $\mathcal{S}$ is an $\ell$-level sampler for some $\ell \geq 0$. We specify the
typed oracularized verifier $\mathcal{V}^{\mathrm{ORAC}} = (\mathcal{S}^{\mathrm{ORAC}}, \mathcal{D}^{\mathrm{ORAC}})$ associated with $\mathcal{V}$ as follows.


Sampler. Define the type set $\mathcal{T}^{\mathrm{ORAC}} = \{\mathrm{ORACLE}, A, B\}$. (In the remainder of this section we refer to
the types in $\mathcal{T}^{\mathrm{ORAC}}$ as roles.) Define the type graph $G^{\mathrm{ORAC}}$ that is the complete graph on vertex set $\mathcal{T}^{\mathrm{ORAC}}$
(including self-loops on all vertices). Define the $\mathcal{T}^{\mathrm{ORAC}}$-type sampler $\mathcal{S}^{\mathrm{ORAC}}$ as follows. For a fixed index
$n \in \mathbb{N}$, let $V$ be the ambient space of $\mathcal{S}$ and $L^w$ for $w \in \{A,B\}$ be the pair of CL functions of $\mathcal{S}$ on index
$n$.

Define two $\mathcal{T}^{\mathrm{ORAC}}$-typed families of CL functions $\{L^w_t : V \to V\}$, for $w \in \{A,B\}$ and $t \in \mathcal{T}^{\mathrm{ORAC}}$, as
follows:

$$L^w_t = \begin{cases} L^t & \text{if } t \in \{A,B\}, \\ \mathrm{Id} & \text{if } t = \mathrm{ORACLE}. \end{cases}$$


In other words, if a player gets the type $t \in \{A,B\}$, then they get the question that original player $t$ would
have received in the game played by $\mathcal{V}_n$. If they get type $t = \mathrm{ORACLE}$, then they get the entire seed $z$ that is
used by the sampler $\mathcal{S}$, from which they can compute both $L^A(z)$ and $L^B(z)$, the pair of questions sampled
for the players in game $\mathcal{V}_n$.

By definition, the sampler distribution $\mu_{\mathcal{S}^{\mathrm{ORAC}}, n}$ has the following properties.


1. Conditioned on both players receiving the ORACLE role, both players receive $z$ for a uniformly
random $z \in V$.


2. Conditioned on both players receiving the isolated player role, the player(s) with role $A$ (respectively,
$B$) receives $L^A(z)$ (respectively, $L^B(z)$) for a uniformly random $z \in V$.


3. Conditioned on player $w \in \{A,B\}$ receiving the ORACLE role and player $\bar{w}$ receiving the isolated
player role, their question tuple is distributed according to $((\mathrm{ORACLE}, z), (v, L^v(z)))$ if $w = A$ and
$((v, L^v(z)), (\mathrm{ORACLE}, z))$ if $w = B$, where $z \in V$ is uniformly random and $v$ indicates the role of
player $\bar{w}$.
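
The behaviour of the oracularized sampler described above can be sketched in a few lines of Python. This is an illustration only: the callables in `L` are hypothetical stand-ins for the CL functions $L^A, L^B$, `seed_space` is a toy replacement for the ambient space $V$, and drawing the two roles uniformly and independently is an assumption of the sketch (it is consistent with the type graph being complete with self-loops).

```python
import random

ROLES = ("ORACLE", "A", "B")  # the type set T^ORAC


def orac_question(role, z, L):
    """Question sent to a player holding `role` when the seed is z.

    L is a dict {"A": L_A, "B": L_B} of stand-ins (hypothetical) for the
    CL functions of the original sampler S.
    """
    if role == "ORACLE":
        return (role, z)           # L_ORACLE = Id: the oracle sees the whole seed
    return (role, L[role](z))      # an isolated player sees only L^role(z)


def sample_orac(L, seed_space, rng=random):
    """Sample one question pair of the oracularized game.

    Assumption of this sketch: the seed and both roles are drawn uniformly.
    """
    z = rng.choice(seed_space)
    t_a, t_b = rng.choice(ROLES), rng.choice(ROLES)
    return orac_question(t_a, z, L), orac_question(t_b, z, L)
```

Both questions are derived from the same seed `z`, which is what makes the oracle's pair of questions consistent with the isolated player's single question.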


Decider. The typed decider $\mathcal{D}^{\mathrm{ORAC}}$ is specified in Figure 12.


Input to decider $\mathcal{D}^{\mathrm{ORAC}}$: $(n, t_A, x_A, t_B, x_B, a_A, a_B)$. For $w \in \{A,B\}$, if $t_w = \mathrm{ORACLE}$, then
parse $a_w$ as a pair $(a_{w,A}, a_{w,B})$. Perform the following steps sequentially.


1. (Game check). For all $w \in \{A,B\}$, if $t_w = \mathrm{ORACLE}$, then compute $x_{w,v} = L^v(x_w)$ for
$v \in \{A,B\}$. If $\mathcal{D}$ rejects $(n, x_{w,A}, x_{w,B}, a_{w,A}, a_{w,B})$, then reject.
2. (Consistency checks).
(a) If $t_A = t_B$ and $a_A \neq a_B$, then reject.
(b) If for some $w \in \{A,B\}$, $t_w = \mathrm{ORACLE}$, $t_{\bar{w}} \in \{A,B\}$, and $a_{w, t_{\bar{w}}} \neq a_{\bar{w}}$, then reject.


3. Accept if none of the preceding steps rejects.


Figure 12: Specification of the typed decider $\mathcal{D}^{\mathrm{ORAC}}$.
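
The three steps of Figure 12 translate almost line by line into code. The following Python sketch is illustrative: `D` and the entries of `L` are hypothetical callables standing in for the original decider and the CL functions of the sampler, and the dictionary-based input format is a choice made for this example.

```python
def other(w):
    """The other player."""
    return "B" if w == "A" else "A"


def decide_orac(D, L, n, t, x, a):
    """Sketch of the typed decider D^ORAC.

    t, x, a are dicts indexed by player "A"/"B" holding that player's
    role, question and answer. D(n, xA, xB, aA, aB) -> bool and L[v]
    are hypothetical stand-ins. An ORACLE answer is a pair (a_A, a_B).
    """
    # 1. Game check: every oracle answer pair must be accepted by D.
    for w in ("A", "B"):
        if t[w] == "ORACLE":
            x_a, x_b = L["A"](x[w]), L["B"](x[w])
            ans_a, ans_b = a[w]
            if not D(n, x_a, x_b, ans_a, ans_b):
                return False
    # 2(a). Equal roles: the two answers must coincide.
    if t["A"] == t["B"] and a["A"] != a["B"]:
        return False
    # 2(b). Oracle against isolated: the matching oracle component must agree.
    for w in ("A", "B"):
        v = t[other(w)]
        if t[w] == "ORACLE" and v in ("A", "B"):
            component = a[w][0] if v == "A" else a[w][1]
            if component != a[other(w)]:
                return False
    # 3. No step rejected.
    return True
```

Note that when the two players hold the distinct isolated roles A and B, none of the checks applies and the sketch accepts, matching the "Both isolated" case of the completeness analysis.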


9.3 Completeness and complexity of the oracularized verifier 
We determine the complexity of the oracularized verifier and establish the completeness property. 


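Theorem 9.1 (Completeness and complexity of the oracularized verifier). Let $\mathcal{V} = (\mathcal{S}, \mathcal{D})$ be a normal
form verifier. Let $\mathcal{V}^{\mathrm{ORAC}} = (\mathcal{S}^{\mathrm{ORAC}}, \mathcal{D}^{\mathrm{ORAC}})$ be the corresponding typed oracularized verifier. Then the
following hold.


• (Completeness) For all $n \in \mathbb{N}$, if $\mathcal{V}_n$ has a PCC strategy of value 1, then $\mathcal{V}^{\mathrm{ORAC}}_n$ has an SPCC strategy
of value 1.


• (Sampler complexity) The sampler $\mathcal{S}^{\mathrm{ORAC}}$ depends only on $\mathcal{S}$ (and not on $\mathcal{D}$). Moreover, the time
complexity of $\mathcal{S}^{\mathrm{ORAC}}$ satisfies

$$\mathrm{TIME}_{\mathcal{S}^{\mathrm{ORAC}}}(n) = O(\mathrm{TIME}_{\mathcal{S}}(n)).$$

Furthermore, if $\mathcal{S}$ is an $\ell$-level sampler, then $\mathcal{S}^{\mathrm{ORAC}}$ is an $\ell$-level typed sampler.

• (Decider complexity) The time complexity of $\mathcal{D}^{\mathrm{ORAC}}$ satisfies

$$\mathrm{TIME}_{\mathcal{D}^{\mathrm{ORAC}}}(n) = \mathrm{poly}(\mathrm{TIME}_{\mathcal{D}}(n), \mathrm{TIME}_{\mathcal{S}}(n)).$$

• (Efficient computability) There is a Turing machine ComputeOracleVerifier which takes as input
$\mathcal{V} = (\mathcal{S}, \mathcal{D})$ and returns $\mathcal{V}^{\mathrm{ORAC}} = (\mathcal{S}^{\mathrm{ORAC}}, \mathcal{D}^{\mathrm{ORAC}})$ in time $\mathrm{poly}(|\mathcal{V}|)$.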


Remark 9.2. Unlike with the Introspection transformation, we do not detype the oracularized verifier
$\mathcal{V}^{\mathrm{ORAC}}$: this is because the analysis of the Answer Reduction transformation in Section 10 directly reduces
to the analysis of the typed oracularized verifier $\mathcal{V}^{\mathrm{ORAC}}$ defined here.


Proof. We analyze the completeness and complexity properties of the typed verifier $\mathcal{V}^{\mathrm{ORAC}}$.




Completeness. For $n \in \mathbb{N}$, let $\mathscr{S} = (|\psi\rangle, A, B)$ be a PCC strategy for $\mathcal{V}_n$ with value 1. Consider the
following symmetric strategy $\mathscr{S}^{\mathrm{ORAC}} = (|\psi\rangle, M)$ for $\mathcal{V}^{\mathrm{ORAC}}_n$. Depending on the role received, each player
performs the following:


1. Suppose the player receives role $v \in \{A,B\}$ and question $x$. Then the player performs the measurement
that player $v$ would on question $x$ according to strategy $\mathscr{S}$ to obtain outcome $a$ (either $\{A^x_a\}$ or
$\{B^x_a\}$, depending on $v$). The player replies with $a$.


2. Suppose the player receives role $v = \mathrm{ORACLE}$ and question $x$. The player first computes $y_w = L^w(x)$
for $w \in \{A,B\}$, where $L^w$ is the CL function of $\mathcal{S}$ corresponding to player $w$. Then,
the player measures using the POVM $\{M^{\mathrm{ORACLE},x}_{a_A,a_B}\}$ where

$$M^{\mathrm{ORACLE},x}_{a_A,a_B} = B^{y_B}_{a_B} A^{y_A}_{a_A}. \tag{105}$$

The projectors $A^{y_A}_{a_A}$ and $B^{y_B}_{a_B}$ commute because $(y_A, y_B)$ is distributed according to $\mu_{\mathcal{S},n}$ (over the
choice of $x$) and $\mathscr{S}$ is a commuting strategy for $\mathcal{V}_n$. Thus $M^{\mathrm{ORACLE},x}_{a_A,a_B}$ is a projector. The player replies
with $(a_A, a_B)$.


The strategy $\mathscr{S}^{\mathrm{ORAC}}$ is symmetric and projective by construction, and consistency follows from the
consistency of $\mathscr{S}$. We now argue that the strategy is commuting and has value 1 in the game $\mathcal{V}^{\mathrm{ORAC}}_n$. We
consider all possible pairs of roles.


1. (Oracle, isolated) Suppose without loss of generality that player $w = A$ gets the ORACLE role and
player $\bar{w} = B$ gets the isolated player $B$ role. Then player $w$ gets question $x$ and player $\bar{w}$ gets
question $L^B(x)$, where $x$ is uniformly sampled from $V$. The oracle player computes $y_v = L^v(x)$
for all $v \in \{A,B\}$. Notice that $(y_A, y_B)$ is distributed according to $\mu_{\mathcal{S},n}$. The two players return
$((a_A, a_B), a)$ with probability

$$\langle\psi| M^{\mathrm{ORACLE},x}_{a_A,a_B} \otimes B^{y_B}_{a} |\psi\rangle = \langle\psi| B^{y_B}_{a_B} A^{y_A}_{a_A} \otimes B^{y_B}_{a} |\psi\rangle$$
$$= \langle\psi| A^{y_A}_{a_A} \otimes B^{y_B}_{a_B} B^{y_B}_{a} |\psi\rangle$$
$$= \delta_{a_B, a} \, \langle\psi| A^{y_A}_{a_A} \otimes B^{y_B}_{a_B} |\psi\rangle,$$

where the first equality uses the definition of $M^{\mathrm{ORACLE},x}_{a_A,a_B}$ from Eq. (105), the second equality uses
the consistency of $\mathscr{S}$, and the third equality uses the projectivity of $\mathscr{S}$. Notice that when $a_B = a$,
this is exactly the probability of obtaining answers $(a_A, a_B)$ when player A and player B get question
pair $(y_A, y_B)$ in the game $\mathcal{V}_n$. Since $\mathscr{S}$ is value-1, the answers satisfy the decision procedure of $\mathcal{V}_n$
with probability 1. Thus the oracle's answers pass the "Game check" of the oracularized decider with
certainty, and furthermore the oracle's answers are consistent with the isolated player's answers and
thus pass the "Consistency check" with certainty as well.

Commutativity of $M^{\mathrm{ORACLE},x}_{a_A,a_B}$ and $B^{y_B}_{a}$ follows from the commutativity of $\mathscr{S}$ for the game $\mathcal{V}_n$.


2. (Both oracle) If both players get the ORACLE role, then both players receive the same question $x \in V$.
Using a similar analysis as for the previous item, the players return the same answer pair (thus passing
the "Consistency check") and pass the "Game check". Both players' measurements commute because
they are identical.




3. (Both isolated) Suppose that both players receive the same isolated player role (e.g., they both receive
the isolated player role $A$). They then perform the same measurements, which produce the same
outcomes due to the consistency of the strategy $\mathscr{S}$, and thus they pass the "Consistency check".
Otherwise, suppose that one player receives the $A$ role and the other player receives the $B$ role. Then
the decider $\mathcal{D}^{\mathrm{ORAC}}$ automatically accepts. Furthermore, their measurements commute because their
questions are distributed according to $\mu_{\mathcal{S},n}$, and $\mathscr{S}$ is a commuting strategy with respect to $\mu_{\mathcal{S},n}$.


Complexity. It is clear from the definition that $\mathcal{S}^{\mathrm{ORAC}}$ depends only on $\mathcal{S}$. The time complexity of the
sampler $\mathcal{S}^{\mathrm{ORAC}}$ is dominated by that of the sampler $\mathcal{S}$. The complexity of $\mathcal{D}^{\mathrm{ORAC}}$ is dominated by the
complexity of both $\mathcal{S}$ and $\mathcal{D}$ and of performing the consistency checks. The sampler $\mathcal{S}^{\mathrm{ORAC}}$ is a $\max\{\ell, 1\}$-level
sampler because $\mathcal{S}$ is an $\ell$-level sampler and the new CL functions for $t = \mathrm{ORACLE}$ are 1-level.


Efficient computability. The description of $\mathcal{S}^{\mathrm{ORAC}}$ can be computed, in polynomial time, from the
description of $\mathcal{S}$ alone. The description of $\mathcal{D}^{\mathrm{ORAC}}$ can be computed in polynomial time from the descriptions
of $\mathcal{S}$ and $\mathcal{D}$. Moreover, in each case the computation amounts to copying the description of $\mathcal{S}$ and $\mathcal{D}$
respectively, and adding constant-sized additional instructions.

□


9.4 Soundness of the oracularized verifier 


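Theorem 9.3 (Soundness of the oracularized verifier). Let $\mathcal{V} = (\mathcal{S}, \mathcal{D})$ be a normal form verifier and
$\mathcal{V}^{\mathrm{ORAC}} = (\mathcal{S}^{\mathrm{ORAC}}, \mathcal{D}^{\mathrm{ORAC}})$ be the corresponding typed oracularized verifier. Then there exists a function
$\delta(\varepsilon) = \mathrm{poly}(\varepsilon)$ such that for all $n \in \mathbb{N}$ the following hold.


1. If $\mathrm{val}^*(\mathcal{V}^{\mathrm{ORAC}}_n) > 1 - \varepsilon$, then $\mathrm{val}^*(\mathcal{V}_n) \geq 1 - \delta(\varepsilon)$.
2. For all $\varepsilon > 0$, we have that
$$\mathscr{E}(\mathcal{V}^{\mathrm{ORAC}}_n, 1 - \varepsilon) \geq \mathscr{E}(\mathcal{V}_n, 1 - \delta(\varepsilon)),$$
where $\mathscr{E}(\cdot)$ is as in Definition 5.12.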


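Proof. Fix $n \in \mathbb{N}$. Let $\mathscr{S}^{\mathrm{ORAC}} = (|\psi\rangle, A, B)$ be a strategy for $\mathcal{V}^{\mathrm{ORAC}}_n$ with value $1 - \varepsilon$ for some $0 < \varepsilon < 1$.
Using Lemma 5.7 we may without loss of generality assume that $\mathscr{S}^{\mathrm{ORAC}}$ is projective. Let $(t,x) \in \mathcal{T}^{\mathrm{ORAC}} \times
V$ be a question to player $w = A$. In the event that $t = \mathrm{ORACLE}$ (which occurs with probability 1/3), let
$y_v = L^v(x)$ for each $v \in \{A,B\}$. From the consistency check performed by $\mathcal{D}^{\mathrm{ORAC}}$ and item 1 of Fact 5.18,
we have that for all $v \in \{A,B\}$ and on average over $x$ sampled by $\mathcal{S}^{\mathrm{ORAC}}$,

$$A^{\mathrm{ORACLE},x}_{a_v} \otimes I_B \approx_{\varepsilon} I_A \otimes B^{v, y_v}_{a_v}. \tag{106}$$

Here, we used that with probability 1/9 player $w = A$ gets the ORACLE role and player $\bar{w} = B$ gets the
isolated player $v$ role; conditioned on this, player $B$ gets question $y_v$.
Using Fact 5.19 on (106), with the $C$ operators from Fact 5.19 chosen as $\{A^{\mathrm{ORACLE},x}_{a_A,a_B}\}$ here we get

$$A^{\mathrm{ORACLE},x}_{a_A,a_B} \otimes I_B = A^{\mathrm{ORACLE},x}_{a_A,a_B} \cdot \big(A^{\mathrm{ORACLE},x}_{a_B} \otimes I_B\big)$$
$$\approx_{\varepsilon} A^{\mathrm{ORACLE},x}_{a_A,a_B} \cdot \big(I_A \otimes B^{B, y_B}_{a_B}\big)$$
$$= A^{\mathrm{ORACLE},x}_{a_A,a_B} \otimes B^{B, y_B}_{a_B}, \tag{107}$$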



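where for the first equality we used that the POVM elements $\{A^{\mathrm{ORACLE},x}_{a_A,a_B}\}$ are projective. Using a similar
calculation for the second and third approximations below we get

$$A^{\mathrm{ORACLE},x}_{a_A,a_B} \otimes I_B \approx_{\varepsilon} A^{\mathrm{ORACLE},x}_{a_A,a_B} \otimes B^{B, y_B}_{a_B}$$
$$\approx_{\varepsilon} A^{\mathrm{ORACLE},x}_{a_A} \otimes B^{B, y_B}_{a_B}$$
$$\approx_{\varepsilon} I_A \otimes B^{B, y_B}_{a_B} B^{A, y_A}_{a_A}$$
$$\approx_{\varepsilon} I_A \otimes B^{A, y_A}_{a_A} B^{B, y_B}_{a_B}, \tag{108}$$

where the last line uses that starting from $A^{\mathrm{ORACLE},x}_{a_A,a_B} \otimes I_B$ we may perform all operations leading to the
before-last line while reversing the order in which $B^{B}$ and $B^{A}$ are introduced.
Using Item 2a of the consistency check, we have that on average over a random $y = L^A(x) \in L^A(V)$,

$$A^{A, y}_{a} \otimes I_B \approx_{\varepsilon} I_A \otimes B^{A, y}_{a}. \tag{109}$$

Define, for all $y \in L^A(V)$, measurement operators $\{C^y_a\}$, where $C^y_a = A^{A,y}_a$. Similarly, define $D^y_b = B^{B,y}_b$.
This defines a strategy $\mathscr{S} = (|\psi\rangle, C, D)$ for the game $\mathcal{V}_n$ that we now argue succeeds with high probability.
Let $x \in V$ be uniformly random. Let $y_A = L^A(x)$ and $y_B = L^B(x)$. Then

$$A^{\mathrm{ORACLE},x}_{a_A,a_B} \otimes I_B \approx_{\varepsilon} I_A \otimes B^{B, y_B}_{a_B} B^{A, y_A}_{a_A}$$
$$\approx_{\varepsilon} A^{A, y_A}_{a_A} \otimes B^{B, y_B}_{a_B}$$
$$= C^{y_A}_{a_A} \otimes D^{y_B}_{a_B}.$$

The first approximation follows from Equation (108). The second approximation follows from Equation (109)
and Fact 5.19 (where we let $C$ in Fact 5.19 represent $I_A \otimes B^{B, y_B}_{a_B}$). The last equality follows
from the definition of $C^{y_A}_{a_A}$ and $D^{y_B}_{a_B}$. The pair of questions $(y_A, y_B) \in V \times V$ is distributed according to $\mu_{\mathcal{S},n}$.
The game check part of $\mathcal{D}^{\mathrm{ORAC}}$ succeeds with probability $1 - O(\varepsilon)$, which implies that the answer pair
$(a_A, a_B)$ that arises from the measurement $A^{\mathrm{ORACLE},x}_{a_A,a_B} \otimes I_B$ is accepted by the decider $\mathcal{D}$ on question pair
$(y_A, y_B)$ with probability $1 - O(\varepsilon)$. This in turn implies that the strategy $\mathscr{S} = (|\psi\rangle, C, D)$ succeeds with
probability $1 - O(\sqrt{\varepsilon})$ in the game $\mathcal{V}_n$. As an additional consequence, the Schmidt rank of $|\psi\rangle$ must be at
least $\mathscr{E}(\mathcal{V}_n, 1 - O(\sqrt{\varepsilon}))$.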




10 Answer Reduction 


In this section we show how to transform a normal form verifier $\mathcal{V} = (\mathcal{S}, \mathcal{D})$ into an "answer reduced"
normal form verifier $\mathcal{V}^{\mathrm{AR}} = (\mathcal{S}^{\mathrm{AR}}, \mathcal{D}^{\mathrm{AR}})$ such that the values of the associated nonlocal games are directly
related, yet the answer-reduced verifier's decision runtime is only polylogarithmic in the answer length of
the original verifier (the answer-reduced verifier's sampling runtime remains polynomially related to the
sampling runtime of $\mathcal{S}$). The polylogarithmic dependence is achieved by composing a probabilistically
checkable proof (PCP) with the oracularized verifier given in Section 9. This step generalizes the answer
reduction technique of [NW19, Part V].

Given index $n \in \mathbb{N}$, the answer reduced verifier $\mathcal{V}^{\mathrm{AR}}_n$ simulates the oracularization $\mathcal{V}^{\mathrm{ORAC}}_n$ of $\mathcal{V}$ on
index $n$. To do so, it first samples questions $x$ and $y$ using the oracularized sampler $\mathcal{S}^{\mathrm{ORAC}}$ and distributes
them to the players, who compute answers $a$ and $b$. Let us suppose that the first player is assigned the
ORACLE role, and parse their question and answer as pairs $x = (x_A, x_B)$ and $a = (a_A, a_B)$, while the
second player is an "isolated" player receiving the question $x_A$ and responding with answer $b$. Instead of
executing the decider $\mathcal{D}^{\mathrm{ORAC}}$ on the answers $(a,b)$, the verifier $\mathcal{V}^{\mathrm{AR}}$ asks the first player to compute a PCP
$\Pi$ of $\mathcal{D}(n, x_A, x_B, a_A, a_B) = 1$, and the second player to compute an encoding $g_b$ of answer $b$. $\mathcal{V}^{\mathrm{AR}}$ then
requests randomly chosen locations of the proof $\Pi$ and the encoding $g_b$, and executes the PCP verifier on
the players' answers. By the soundness of the PCP, $\mathcal{V}^{\mathrm{AR}}$ accepts with high probability only if the players'
answers satisfy $\mathcal{D}(n, x_A, x_B, a_A, a_B) = 1$ and $b = a_A$.

There are several challenges that arise when implementing answer reduction. One challenge, already
encountered in [NW19], is that we need to ensure the PCP $\Pi$ computed by the first player can be cross-tested
against the encoding $g_b$ computed by the second player (who doesn't know the entire structure of the PCP
$\Pi$). This was handled in [NW19] by using a special type of PCP called a probabilistically checkable proof
of proximity (PCPP), which allows one to efficiently check that a specific string $x$ is a satisfying assignment
to a Boolean formula $\varphi$, as opposed to simply checking that $\varphi$ is satisfiable. In a PCPP, an encoding of
the specific string $x$ is provided separately from the proof of satisfiability. The answer reduction scheme
of [NW19] was able to use an "off-the-shelf" PCPP in a relatively black-box fashion to handle this.

In our answer reduction scheme, however, there is a further requirement: we need the question distri- 
bution of the answer reduced verifier to be conditionally linear. This is necessary to maintain the invariant 
that the verifier after each step of the compression procedure (introspection, answer reduction, parallel repe- 
tition) is a normal form verifier. Unfortunately, simulating the question distributions of off-the-shelf PCPPs 
with conditionally linear distributions can be quite cumbersome. Instead, we design a bespoke PCP verifier 
for the protocol whose question distribution is more easily seen to be conditionally linear. 

This section is organized as follows. We start with some preliminaries on formulas and encodings 
in Section 10.1. In Section 10.2 we show how to use the Cook-Levin reduction to reduce the Bounded 
Halting problem for deciders to a succinct satisfiability problem called Succinct-3SAT. Following this, in 
Section 10.3, we reduce the Succinct-3SAT instance to an instance of a related problem called Succinct 
Decoupled 5SAT, which is easier to use in our answer reduction step. Then in Section 10.4 we introduce 
a PCP for Succinct Decoupled 5SAT. The verifier for the PCP expects a proof consisting of the evaluation 
tables of low-degree polynomials, including the low-degree encodings of the players’ answers a and b. 
In Section 10.5 we provide the definition of a normal-form verifier $\mathcal{V}^{\mathrm{AR}}$ that executes the composition of
$\mathcal{V}^{\mathrm{ORAC}}$ with the PCP verifier from Section 10.4. In Section 10.6 we show completeness of the construction
and analyze its complexity. In Section 10.7 we prove soundness. 




10.1 Circuit preliminaries 


Recall the definitions pertaining to Turing machines from Section 3.1. In this section it is useful to model
computation using Boolean circuits. A Boolean circuit is a network of Boolean gates, connected by directed
"wires." Incoming wires encode the input to the circuit, and outgoing wires encode the output. For the most
part, we will assume that the output of the circuit is a single bit, corresponding to a single output wire. The
in-degree of each gate (the "fan-in") is restricted to be at most 2, but there is no restriction on the out-degree
(the "fan-out"). The size of a circuit is the total number of gates and wires that it contains. For a more
detailed description, see Section 4.3 of [Pap94].


Remark 10.1 (Plugging integers into circuits). Let $C$ be a circuit with a single input of length $n$. Inputs to $C$
are strings $x \in \{0,1\}^n$. In this section, we will also allow $C$ to receive inputs $a \in \{0,1,\dots,2^n - 1\}$. In
doing so, we use the convention that a number $a$ between $0$ and $2^n - 1$ is interpreted as its $n$-digit binary
encoding $\mathrm{binary}_n(a)$ when provided as input to a set of $n$ single-bit wires. In other words, $C(a) = C(x)$,
where $x = \mathrm{binary}_n(a)$.

More generally, if the circuit $C$ has $k$ different inputs of length $n_1, \dots, n_k$, then we can evaluate it on
inputs $a_1 \in \{0,1,\dots,2^{n_1} - 1\}, \dots, a_k \in \{0,1,\dots,2^{n_k} - 1\}$ as follows:

$$C(a_1, \dots, a_k) = C(x_1, \dots, x_k),$$

where $x_1 = \mathrm{binary}_{n_1}(a_1), \dots, x_k = \mathrm{binary}_{n_k}(a_k)$.


A 3SAT formula is a Boolean formula in conjunctive normal form in which at most three literals appear
in each clause. More precisely, $\varphi$ is a 3SAT formula on $N$ variables $x_1, x_2, \dots, x_N$ if it has the form $\bigwedge_j C_j$
and each clause $C_j$ is the disjunction of at most three literals, where a literal is either a variable $x_i$ or its
negation $\neg x_i$. We use $x_i^{o}$ to denote the literal $x_i$ if $o = 1$ and $\neg x_i$ if $o = 0$.


Definition 10.2 (Succinct description of 3SAT formulas). Let $N = 2^n$, and let $\varphi$ be a 3SAT formula on $N$
variables named $x_0, \dots, x_{N-1}$. Let $C$ be a Boolean circuit with 3 inputs of length $n$ and three single-bit
inputs. Then $C$ is a succinct description of $\varphi$ if for each $i_1, i_2, i_3 \in \{0,1,\dots,N-1\}$ and $o_1, o_2, o_3 \in \{0,1\}$,

$$C(i_1, i_2, i_3, o_1, o_2, o_3) = 1 \tag{110}$$

if and only if $x_{i_1}^{o_1} \vee x_{i_2}^{o_2} \vee x_{i_3}^{o_3}$ is a clause in $\varphi$. In Equation (110), we use the notation from Remark 10.1.
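
To make the definition concrete, the following Python sketch expands a succinct description into the explicit clause list it describes and checks an assignment against it. It is a toy illustration: the circuit is modelled as an ordinary Python function, the most-significant-bit-first convention in `binary` is an assumption of this sketch, and `toy` is a hypothetical example circuit. The expansion enumerates all $8N^3$ candidate clauses, which is exactly the exponential blow-up that succinct descriptions avoid.

```python
from itertools import product


def binary(a, n):
    """n-bit binary encoding of 0 <= a < 2**n (MSB first; an assumption here)."""
    return tuple((a >> (n - 1 - k)) & 1 for k in range(n))


def expand(C, n):
    """List every clause of the formula succinctly described by C.

    C takes three n-bit tuples and three sign bits and returns 0 or 1.
    A clause is returned as three (variable index, sign) pairs.
    """
    N = 2 ** n
    clauses = []
    for i1, i2, i3 in product(range(N), repeat=3):
        for o1, o2, o3 in product((0, 1), repeat=3):
            if C(binary(i1, n), binary(i2, n), binary(i3, n), o1, o2, o3):
                clauses.append(((i1, o1), (i2, o2), (i3, o3)))
    return clauses


def satisfies(clauses, x):
    # The literal (i, o) stands for x_i if o == 1 and for NOT x_i if o == 0.
    return all(any(x[i] == o for i, o in clause) for clause in clauses)


def toy(b1, b2, b3, o1, o2, o3):
    """Hypothetical circuit describing the clauses x_i OR x_i OR NOT x_{i+1 mod N}."""
    i1 = int("".join(map(str, b1)), 2)
    i3 = int("".join(map(str, b3)), 2)
    N = 2 ** len(b1)
    return int(b1 == b2 and i3 == (i1 + 1) % N and (o1, o2, o3) == (1, 1, 0))
```

For instance, `expand(toy, 2)` yields one clause per variable index, and `satisfies` can then be used to test candidate assignments of length 4.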


Definition 10.3 (Succinct-3SAT problem). The Succinct-3SAT problem is the language containing encodings
of circuits $C$ in which $C$ is a succinct description of a satisfiable 3SAT formula $\varphi$.


The Tseitin transformation is a mapping from circuits to Boolean formulas. The following summarizes 
its properties; for an explicit construction, see [NW19, Section 3.8]. 


Definition 10.4. Let $C$ be a Boolean circuit with $n$ inputs and size $s$. Then the corresponding Tseitin formula
$F$ is a Boolean formula on $n + s$ variables with the property that for all $x \in \{0,1\}^n$, $C(x) = 1$ if and only
if there exists $w \in \{0,1\}^s$ such that $F(x,w) = 1$. The formula $F$ has size $O(s)$ and a description of it is
computable from $C$ in time $O(s)$.


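Definition 10.5. Let $F$ be a Boolean formula over $m$ variables. The arithmetization $F_{\mathrm{arith}}$ over $\mathbb{F}_q$ is a
function $F_{\mathrm{arith}} : \mathbb{F}_q^m \to \mathbb{F}_q$ such that

$$\forall x \in \{0,1\}^m, \quad F_{\mathrm{arith}}(x) = F(x). \tag{111}$$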
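
The defining property (111) can be illustrated with the textbook gate-by-gate arithmetization (NOT becomes $1 - u$, AND becomes $uv$, OR becomes $u + v - uv$). This is a sketch under stated assumptions: it is not claimed to be the construction of [NW19], the tuple-based formula format is invented for the example, and a small prime `Q` stands in for the field size $q$ so that arithmetic can be done modulo `Q`.

```python
from itertools import product

Q = 7  # small prime standing in for the field size q (assumption of this sketch)

# A formula is a nested tuple:
#   ("var", i), ("not", f), ("and", f, g) or ("or", f, g).


def eval_bool(f, x):
    """Evaluate the Boolean formula f on a 0/1 assignment x."""
    op = f[0]
    if op == "var":
        return x[f[1]]
    if op == "not":
        return 1 - eval_bool(f[1], x)
    if op == "and":
        return eval_bool(f[1], x) & eval_bool(f[2], x)
    return eval_bool(f[1], x) | eval_bool(f[2], x)


def eval_arith(f, z, q=Q):
    """Evaluate the arithmetization of f at a point z with entries in F_q."""
    op = f[0]
    if op == "var":
        return z[f[1]] % q
    if op == "not":
        return (1 - eval_arith(f[1], z, q)) % q
    u, v = eval_arith(f[1], z, q), eval_arith(f[2], z, q)
    if op == "and":
        return (u * v) % q
    return (u + v - u * v) % q  # "or"


formula = ("or", ("and", ("var", 0), ("not", ("var", 1))), ("var", 2))

# Property (111): the polynomial agrees with the formula on the Boolean cube.
agrees = all(eval_arith(formula, x) == eval_bool(formula, x)
             for x in product((0, 1), repeat=3))
```

Outside the Boolean cube the polynomial takes arbitrary field values; only its restriction to $\{0,1\}^m$ is pinned down by the definition.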




Proposition 10.6. Let $C$ be a circuit on $n$ inputs with size $s$. Then the arithmetization $F_{\mathrm{arith}}$ of its Tseitin
formula is a polynomial over $\mathbb{F}_q$ on $n + s$ variables with individual degree at most 2 in each variable. For
admissible field sizes $q$, a description of the polynomial $F_{\mathrm{arith}}$ can be computed in $\mathrm{poly}(s, \log q)$ time, and it can
be evaluated at a specific point $z \in \mathbb{F}_q^{n+s}$ in time $\mathrm{poly}(s, \log q)$.


Proof. The properties follow by inspecting the construction in Definitions 3.27 and 3.28 of [NW19], together
with Lemma 3.18. In particular, the degree bound follows by observing that every variable in $F$ occurs at
most twice, and therefore the arithmetization $F_{\mathrm{arith}}$ has individual degree at most 2. □


10.2 A Cook-Levin theorem for bounded deciders 


Definition 10.7 (Bounded Halting problem). The $k$-input Bounded Halting problem is the language $\mathrm{BH}_k$
containing the set of tuples $(\alpha, T, z_1, \dots, z_k)$ where $\alpha$ is the description of a $k$-input Turing machine, $T \in \mathbb{N}$,
$z_1, \dots, z_k \in \{0,1\}^*$, and $M_\alpha$ accepts input $(z_1, \dots, z_k)$ in at most $T$ time steps.


We begin by defining natural encodings of a decider’s tape alphabet and set of states. 


Definition 10.8 (Decider encodings). Let $\mathcal{D}$ be a decider with tape alphabet $\Gamma = \{0, 1, \sqcup\}$ and set of
states $K$. We will write $\mathrm{enc}_\Gamma : \Gamma \cup \{\square\} \to \{0,1\}^2$ for the function which encodes the elements of $\Gamma$, and a
special "$\square$" symbol described below, as length-two binary strings in the following manner:

$$\mathrm{enc}_\Gamma(0) = 00, \qquad \mathrm{enc}_\Gamma(1) = 01,$$
$$\mathrm{enc}_\Gamma(\sqcup) = 10, \qquad \mathrm{enc}_\Gamma(\square) = 11.$$

In addition, we write $\mathrm{enc}_K : K \to \{0,1\}^\kappa$ for some arbitrary fixed $\kappa$-bit encoding of the elements of $K$,
where $\kappa = \lceil \log(|K|) \rceil$.
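
As a small illustration of these encodings, the Python sketch below implements $\mathrm{enc}_\Gamma$ as a lookup table and pads a binary string with blanks to fill $T$ tape cells. The helper names are hypothetical; `enc_state` shows just one possible choice for the otherwise arbitrary encoding $\mathrm{enc}_K$ (states are identified with indices, and at least two states are assumed).

```python
BLANK, BOUNDARY = "⊔", "□"

# enc_Gamma from Definition 10.8, written as two-character bit strings.
ENC_GAMMA = {"0": "00", "1": "01", BLANK: "10", BOUNDARY: "11"}


def enc_gamma(symbols):
    """Concatenate the two-bit encodings of a sequence of tape symbols."""
    return "".join(ENC_GAMMA[s] for s in symbols)


def pad_input(prefix, T):
    """Encode a binary string followed by blanks so that it fills T cells."""
    return enc_gamma(list(prefix) + [BLANK] * (T - len(prefix)))


def enc_state(index, num_states):
    """One possible kappa-bit encoding of the state with the given index.

    kappa = ceil(log2(num_states)), computed with integer arithmetic;
    assumes num_states >= 2 and 0 <= index < num_states.
    """
    kappa = (num_states - 1).bit_length()
    return format(index, "b").zfill(kappa)
```

A string of $T$ tape cells is thus encoded by exactly $2T$ bits, whatever the length of the non-blank prefix.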


Now, we give the main result of this section. It states that any decider $\mathcal{D}$ can be converted into a circuit $C$
which succinctly represents a 3SAT formula $\varphi_{\mathrm{3SAT}}$ that carries out the time $T$ computation of $\mathcal{D}$. In addition,
$C$ is extremely small: its size is $\mathrm{poly}\log(T)$ rather than $\mathrm{poly}(T)$.


Proposition 10.9 (Succinct representation of deciders). There is an algorithm with the following properties.
Let $\mathcal{D}$ be a decider, let $n$, $T$, $Q$, and $\sigma$ be integers with $Q \leq T$ and $|\mathcal{D}| \leq \sigma$, and let $x$ and $y$ be strings of
length at most $Q$. Then on input $(\mathcal{D}, n, T, Q, \sigma, x, y)$, the algorithm outputs a circuit $C$ on $3m + 3$ inputs
which succinctly describes a 3SAT formula $\varphi_{\mathrm{3SAT}}$ on $M = 2^m$ variables. Furthermore, $\varphi_{\mathrm{3SAT}}$ has the
following property:


• For all $a, b \in \{0,1\}^{2T}$, there exists a $c \in \{0,1\}^{M - 4T}$ such that $w = (a,b,c)$ satisfies $\varphi_{\mathrm{3SAT}}$ if and
only if there exist $a_{\mathrm{prefix}}, b_{\mathrm{prefix}} \in \{0,1\}^*$ of lengths $\ell_a, \ell_b \leq T$, respectively, such that

$$a = \mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{T - \ell_a}) \quad \text{and} \quad b = \mathrm{enc}_\Gamma(b_{\mathrm{prefix}}, \sqcup^{T - \ell_b})$$

and $\mathcal{D}$ accepts $(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$ in time $T$.

Finally, the following statements hold:


1. The parameter $m$ controlling the number of inputs to the circuit depends only on $T$ and $\sigma$, and

$$m(T, \sigma) = O(\log(T) + \log(\sigma)),$$


2. $C$ has at most $s(n, T, Q, \sigma) = \mathrm{poly}(\log(n), \log(T), Q, \sigma)$ gates,


3. The algorithm runs in time $\mathrm{poly}(\log(n), \log(T), Q, \sigma)$,


4. Furthermore, explicit values for $m(T, \sigma)$ and $s(n, T, Q, \sigma)$ can be computed in time polynomial in
$n, \log(T), Q, \sigma$.


Proposition 10.9 is essentially the standard fact that Succinct-3SAT is an NEXP-complete language,
i.e. that every nondeterministic computation which takes time $2^n$ can be represented as a Succinct-3SAT
instance of size only $\mathrm{poly}(n)$. However, it has several peculiarities which require us to prove it from
scratch rather than simply appealing to the NEXP-completeness of Succinct-3SAT. First, we require that the
coordinates of $a$ and $b$ embed into $w$ not randomly but as its lexicographically first coordinates (for reasons
that are explained below in Section 10.3). Second, we need explicit bounds on how quantities such as the
size of $C$ relate to quantities such as $\sigma$, an upper bound on the description length of $\mathcal{D}$ in bits.

To prove Proposition 10.9, we follow the standard proof that Succinct-3SAT is NEXP-complete as
presented in [Pap94]. This proof observes that the Cook-Levin reduction, which is used to show that 3SAT
is NP-complete, produces a 3SAT instance whose clauses follow such a simple pattern that they can be
described succinctly using an exponentially-smaller circuit. One key difference in our proof is that we will
apply the Cook-Levin reduction directly to the 5-input Turing machine $\mathcal{D}$, which by Section 3.1 has 7 tapes;
traditional proofs such as the one in [Pap94] would first convert $\mathcal{D}$ to a single-tape Turing machine $\mathcal{D}_{\mathrm{single}}$,
and then apply the Cook-Levin reduction for single-tape Turing machines to $\mathcal{D}_{\mathrm{single}}$. Though this adds
notational overhead to our proof, it allows us to more easily keep track of which variables in $\varphi_{\mathrm{3SAT}}$ correspond to
the strings $a$ and $b$ (see Proposition 10.9 to see what these refer to).


Proof of Proposition 10.9. The Cook-Levin reduction considers the execution tableau of $\mathcal{D}$ when run for
time $T$. The execution tableau contains, for each time $t \in \{1,\dots,T\}$, variables describing the state of $\mathcal{D}$
and the contents of each of its tape cells at time $t$. More formally, it consists of the following three sets of
variables.


1. For each time $t \in \{1,\dots,T\}$, tape $i \in \{1,\dots,7\}$, and tape position $j \in \{0,1,\dots,T+1\}$, the
tableau contains two Boolean-valued variables

$$c_{t,i,j} = (c_{t,i,j,1}, c_{t,i,j,2}) \in \{0,1\}^2$$

which are supposed to correspond to the contents of the $j$-th tape cell on tape $i$ at time $t$ according
to $\mathrm{enc}_\Gamma(\cdot)$. The variables with $j \in \{0, T+1\}$ do not correspond to any cell on the tape; rather, the
$j = 0$ variables correspond to the left-boundary of the tape, and the $j = T+1$ variables correspond
to the right-boundary of the first $T$ cells on the tape. These are expected to always contain the special
boundary symbol "$\square$", i.e. $c_{t,i,j}$ should be equal to $\mathrm{enc}_\Gamma(\square)$ whenever $j \in \{0, T+1\}$. As we will
see below, it is convenient to define these so that for each $t \in \{1,\dots,T\}$, $i \in \{1,\dots,7\}$, and
$j \in \{1,\dots,T\}$, the variable $c_{t,i,j}$ also has a variable to its left $c_{t,i,j-1}$ and to its right $c_{t,i,j+1}$.


2. For each time $t \in \{1,\dots,T\}$, tape $i \in \{1,\dots,7\}$, and tape position $j \in \{0,1,\dots,T+1\}$, the
tableau contains Boolean-valued variables $h_{t,i,j} \in \{0,1\}$ which are supposed to indicate whether the
$i$-th tape head is in cell $j$ at time $t$. For the boundary cells $j \in \{0, T+1\}$, we expect that $h_{t,i,j} = 0$ for
all $t \in \{1,\dots,T\}$ and $i \in \{1,\dots,7\}$.


3. For each time $t \in \{1,\dots,T\}$, the tableau contains $\kappa$ Boolean-valued variables

$$s_t = (s_{t,1}, \dots, s_{t,\kappa}) \in \{0,1\}^\kappa$$

which are supposed to correspond to the state of $\mathcal{D}$ at time $t$ according to $\mathrm{enc}_K(\cdot)$.




Finally, we let $V$ denote the set of all of these variables. In other words,

$$V = \{c_{t,i,j,k}\}_{t,i,j,k} \cup \{h_{t,i,j}\}_{t,i,j} \cup \{s_{t,k}\}_{t,k}.$$

In total, the number of variables in the execution tableau is given by

$$|V| = O(T^2 + T \cdot \log(|K|)) = O(T^2 + T \cdot \log(|\mathcal{D}|)). \tag{112}$$

The first term in Equation (112) corresponds to the tape cell encodings $c_{t,i,j}$ and $h_{t,i,j}$, and the second term
corresponds to the Turing machine state encodings $s_t$. The second equality uses the fact that $|K| \leq |\mathcal{D}|$.

As stated above, we expect the variables in $V$ to correspond to some time-$T$ execution of the decider $\mathcal{D}$.
However, in general these are just arbitrary $\{0,1\}$-valued variables. We now describe a set of constraints
placed on these variables which, if satisfied, ensure they do indeed correspond to some time-$T$ execution
of $\mathcal{D}$. These constraints will be split into two categories: (i) the constraints corresponding to the boundary,
which ensure that the $t = 1$ variables are initialized to a valid starting configuration and the $j \in \{0, T+1\}$
variables are set according to Items 1 and 2, and (ii) the constraints corresponding to the execution of $\mathcal{D}$,
which ensure that the variables at each time $(t+1)$ follow from the variables at time $t$ according to the
computation of $\mathcal{D}$. We start with the boundary constraints, which are simple enough to be described with a
3SAT formula.


Definition 10.10. The boundary formula $\varphi_{\mathrm{Boundary}}$ is the 3SAT formula on the variables $V$ described as
follows. Let $o = \mathrm{enc}_K(\mathrm{start}) \in \{0,1\}^{\kappa}$, where $\mathrm{start} \in K$ is the start state of $D$. For the $t = 1$ boundary,
$\varphi_{\mathrm{Boundary}}$ contains the following set of clauses.


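Indices                                                Clauses

$i \in \{1,\ldots,7\}$                                 $h_{1,i,1}$                                       (113)
$i \in \{1,\ldots,7\}$, $j \neq 1$                     $\neg h_{1,i,j}$                                  (114)
$k \in \{1,\ldots,\kappa\}$                            $s_{1,k}^{o_k}$                                   (115)
$i \in \{1,\ldots,7\}$, $j \in \{1,\ldots,T\}$         $\neg c_{1,i,j,1} \vee \neg c_{1,i,j,2}$          (116)
$i \in \{1,\ldots,5\}$, $j < j' \in \{1,\ldots,T\}$    $\neg c_{1,i,j,1} \vee c_{1,i,j',1}$              (117)
$i \in \{6,7\}$, $j \in \{1,\ldots,T\}$                $c_{1,i,j,1}$ and $\neg c_{1,i,j,2}$              (118)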


This is meant to be read as follows: for each row, the “Indices” column specifies the range of the indices 
that the clauses in the “Clauses” column are quantified over. For example, row (113) specifies that for all 
$i \in \{1,\ldots,7\}$, $\varphi_{\mathrm{Boundary}}$ contains the clause $h_{1,i,1}$. For the $j \in \{0, T+1\}$ boundary, $\varphi_{\mathrm{Boundary}}$ contains
the following set of clauses. 


Indices                                                                                          Clauses
$t \in \{1,\ldots,T\}$, $i \in \{1,\ldots,7\}$, $j \in \{0, T+1\}$, $k \in \{1,2\}$              $c_{t,i,j,k}^{\mathrm{enc}_\Gamma(\sqcup)_k}$     (119)


Rows (113) and (114) ensure that at time $t = 1$, each tape has exactly one tape head, and it is located
on cell $j = 1$. Row (115) ensures that at time $t = 1$, the state is given by the start state start. (Recall the
notation $x_i^{o}$ to denote the literal $x_i$ if $o = 1$ and $\neg x_i$ if $o = 0$.) For the remaining rows, we recall that under
the encoding of the tape alphabet, $\mathrm{enc}_\Gamma(\sqcup) = 10$ and $\mathrm{enc}_\Gamma(\square) = 11$. As a result, (i) row (119) ensures
that for all times and tapes, the cells $j \in \{0, T+1\}$ contain the $\sqcup$ symbol, (ii) row (116) ensures that for
time $t = 1$, no cell $j \notin \{0, T+1\}$ contains the $\square$ symbol, and (iii) row (118) ensures that for time $t = 1$
and tapes 6 and 7, all cells $j \in \{1,\ldots,T\}$ contain the $\sqcup$ symbol. Finally, row (117) says that for tapes
$i \in \{1,\ldots,5\}$, if cell $j$ contains $\sqcup$, then every cell $j' > j$ must contain $\sqcup$ as well. This means that the five
strings encoded by $c_{1,1},\ldots,c_{1,5}$ each consist of a string of 0's and 1's followed by a string of $\sqcup$'s. In short,
suppose we write $n$, $x$, $y$, $a$, and $b$ for the prefixes of these strings with no $\sqcup$'s. If $\varphi_{\mathrm{Boundary}}$ is satisfied, then
the execution tableau correctly encodes that the tapes of $D$ contain inputs $n$, $x$, $y$, $a$, and $b$ at time $t = 1$.
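
To make rows (113)-(119) concrete, here is a minimal Python sketch that generates the boundary clauses and checks them against a valid starting configuration. It rests on several assumptions that are not stated in the text: the symbols 0 and 1 are taken to be encoded as 00 and 01 (only the encodings 10 and 11 are recalled above), row (114) is read as ranging over all $j \in \{0,\ldots,T+1\}$ with $j \neq 1$, and all function and variable names are hypothetical.

```python
from itertools import product

BLANK = (1, 0)  # enc(blank) = 10, as recalled in the text

def boundary_clauses(T, start_bits):
    """Clauses of the boundary formula; a literal is (variable, wanted_value)."""
    tapes = range(1, 8)
    cl = []
    for i in tapes:
        cl.append([(("h", 1, i, 1), 1)])                                  # row (113)
        for j in range(0, T + 2):
            if j != 1:
                cl.append([(("h", 1, i, j), 0)])                          # row (114)
        for j in range(1, T + 1):
            cl.append([(("c", 1, i, j, 1), 0), (("c", 1, i, j, 2), 0)])   # row (116)
    for k, bit in enumerate(start_bits, start=1):
        cl.append([(("s", 1, k), bit)])                                   # row (115)
    for i in range(1, 6):
        for j in range(1, T + 1):
            for j2 in range(j + 1, T + 1):
                cl.append([(("c", 1, i, j, 1), 0), (("c", 1, i, j2, 1), 1)])  # row (117)
    for i in (6, 7):
        for j in range(1, T + 1):
            cl.append([(("c", 1, i, j, 1), 1)])                           # row (118)
            cl.append([(("c", 1, i, j, 2), 0)])
    for t, i, j, k in product(range(1, T + 1), tapes, (0, T + 1), (1, 2)):
        cl.append([(("c", t, i, j, k), BLANK[k - 1])])                    # row (119)
    return cl

def satisfied(clauses, assignment):
    """A clause holds if some literal's variable takes its wanted value."""
    return all(any(assignment.get(v, 0) == want for v, want in c) for c in clauses)

def start_assignment(T, start_bits, inputs):
    """Tableau variables for five input bit strings at t = 1, plus boundary cells."""
    a = {}
    for i in range(1, 8):
        word = inputs[i - 1] if i <= 5 else ""
        for j in range(0, T + 2):
            a[("h", 1, i, j)] = int(j == 1)
            inside = 1 <= j <= len(word)
            # Assumed encoding: enc(0) = 00 and enc(1) = 01.
            sym = (0, int(word[j - 1])) if inside else BLANK
            a[("c", 1, i, j, 1)], a[("c", 1, i, j, 2)] = sym
    for t, i, j in product(range(2, T + 1), range(1, 8), (0, T + 1)):
        a[("c", t, i, j, 1)], a[("c", t, i, j, 2)] = BLANK
    for k, bit in enumerate(start_bits, start=1):
        a[("s", 1, k)] = bit
    return a
```

Flipping the first bit of a non-blank input cell turns it into the encoding 11, which row (116) rejects.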

Next, we describe the execution constraints. These are more complicated than the boundary constraints, 
and so we will begin by describing them in terms of a general Boolean circuit known as the local check 
circuit. For any time $t \in \{1,\ldots,T-1\}$ and tape positions $j_1,\ldots,j_7 \in \{1,\ldots,T\}$, the local check circuit
can check that the execution tableau properly encodes these tape positions at time $t + 1$ by looking only at the
encodings of these tape positions and their neighbors (i.e. the tape positions $j_i \pm 1$ for each $i \in \{1,\ldots,7\}$)
at time $t$.


Definition 10.11. In this definition, we will define the local check circuit $C_{\mathrm{Check}}$. It has the following inputs.


• For each $i \in \{1,\ldots,7\}$, it has the eight inputs

  $c_{\mathrm{Check},0,i,-1}$  $c_{\mathrm{Check},0,i,0}$  $c_{\mathrm{Check},0,i,1}$  $h_{\mathrm{Check},0,i,-1}$  $h_{\mathrm{Check},0,i,0}$  $h_{\mathrm{Check},0,i,1}$    (120)
  $c_{\mathrm{Check},1,i,0}$  $h_{\mathrm{Check},1,i,0}$

  where the $c$-inputs are in $\{0,1\}^2$ and the $h$-inputs are in $\{0,1\}$.

• It has two inputs $s_{\mathrm{Check},0}, s_{\mathrm{Check},1} \in \{0,1\}^{\kappa}$.


In addition, for each time $t \in \{1,\ldots,T-1\}$ and tape positions $j_1,\ldots,j_7 \in \{1,\ldots,T\}$, we will define a
circuit $C_{\mathrm{Check},t,j_1,\ldots,j_7}$ by associating the inputs of $C_{\mathrm{Check}}$ with certain variables in $V$. We do this by associating,
for each $i \in \{1,\ldots,7\}$, the following eight variables from $V$ with the corresponding inputs in Equation (120):

  $c_{t,i,j_i-1}$  $c_{t,i,j_i}$  $c_{t,i,j_i+1}$  $h_{t,i,j_i-1}$  $h_{t,i,j_i}$  $h_{t,i,j_i+1}$
  $c_{t+1,i,j_i}$  $h_{t+1,i,j_i}$

as well as by associating $s_t$ and $s_{t+1}$ from $V$ with $s_{\mathrm{Check},0}$ and $s_{\mathrm{Check},1}$, respectively.

We first define $C_{\mathrm{Check}}$ as a function. In Proposition 10.12 below we give an implementation of this
function as a circuit. To do so it is convenient to define a function $C_{\mathrm{Check},t,j_1,\ldots,j_7}$ for each value of $t, j_1,\ldots,j_7$.
It will be clear that each of these is in fact the same function applied to different inputs, and hence this will
define $C_{\mathrm{Check}}$ as well. Now, the functions $C_{\mathrm{Check},t,j_1,\ldots,j_7}$ perform the following computation.


• Suppose for each $i \in \{1,\ldots,7\}$, exactly one of the tape positions $j_i - 1$, $j_i$, and $j_i + 1$ at time $t$
  contains a tape head. First, $C_{\mathrm{Check},t,j_1,\ldots,j_7}$ computes the transition function of the Turing machine
  applied to the contents of these 7 tape positions, when the state of $D$ at time $t$ is $s_{\mathrm{Check},0}$. This
  transition function produces the state of $D$ and contents of these tape positions at time $t + 1$, as
  well as directions to move the 7 tape heads in. Then $C_{\mathrm{Check},t,j_1,\ldots,j_7}$ checks that the variables for the
  tape positions $j_1,\ldots,j_7$ and the state of $D$ at time $t + 1$ match what they should be according to
  the computed transition function. In addition, if a tape head is marked as moving into a tape cell
  containing a $\square$ symbol, that tape head remains in place instead.


• Suppose for each $i \in \{1,\ldots,7\}$, none of the tape positions $j_i - 1$, $j_i$, or $j_i + 1$ contain a tape head
  at time $t$. Then $C_{\mathrm{Check},t,j_1,\ldots,j_7}$ checks that the variables for the tape positions $j_1,\ldots,j_7$ at time $t + 1$
  match the variables at time $t$.


• Otherwise, $C_{\mathrm{Check},t,j_1,\ldots,j_7}$ accepts.




This defines $C_{\mathrm{Check},t,j_1,\ldots,j_7}$, and therefore $C_{\mathrm{Check}}$.


Definition 10.11 shows the utility of introducing the boundary variables $c_{t,i,0}$ and $c_{t,i,T+1}$. The function
$C_{\mathrm{Check}}$ checks the contents of a cell $c_{t+1,i,j}$ at time $t + 1$ by looking at the cell and its neighbors $c_{t,i,j-1}$, $c_{t,i,j+1}$
at time $t$. However, those cells with $j \in \{1, T\}$ only have either a left neighbor or a right neighbor, and so
without the boundary variables we would have to introduce two other local check circuits designed just for these
boundary cases. The boundary variables then allow us to use the same local check circuit for all cells.


Proposition 10.12. A circuit $C_{\mathrm{Check}}$ of size at most $\mathrm{poly}(|D|)$ which computes the function described
above can be computed in time $\mathrm{poly}(|D|)$.


Proof. The circuit has $7 \cdot 12 + 2 \cdot \kappa = 84 + 2\kappa$ total Boolean inputs, giving a total of $2^{84} \cdot 4^{\kappa} = O(|K|^2)$
possible input strings. For each possible fixed input string, the circuit $C_{\mathrm{Check}}$ checks if the actual input is
equal to the fixed input, which takes $O(\kappa)$ gates, and then it accepts if the fixed input should be accepting,
i.e. for each fixed input the correct output is hard-coded in the circuit. Doing so takes $O(|K|^2 \cdot \kappa)$ gates.
Computing $C_{\mathrm{Check}}$ requires looping over all possible input strings and checking which ones are accepting
or rejecting. This requires computing the transition function of $D$, a task which takes time $\mathrm{poly}(|D|)$. The
proposition follows by noting that $|K| \le |D|$. □
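
The input count in this proof can be verified with a few lines of Python. This is only an arithmetic sketch: the helper name `check_circuit_input_bits` is hypothetical, and the per-tape count assumes the eight inputs of Equation (120), of which four are 2-bit $c$-inputs and four are 1-bit $h$-inputs.

```python
def check_circuit_input_bits(kappa: int) -> int:
    """Number of Boolean inputs of the local check circuit."""
    per_tape = 4 * 2 + 4 * 1        # four 2-bit c-inputs, four 1-bit h-inputs
    return 7 * per_tape + 2 * kappa  # seven tapes plus two state encodings

def num_input_strings(kappa: int) -> int:
    """Size of the truth table that the proof hard-codes into the circuit."""
    return 2 ** check_circuit_input_bits(kappa)
```

Since $\kappa$ is logarithmic in $|K|$, the truth table has $2^{84} \cdot 4^{\kappa}$ rows, a constant times a polynomial in $|K|$.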


Proposition 10.13. Suppose that the execution tableau satisfies $\varphi_{\mathrm{Boundary}}$ and the circuit $C_{\mathrm{Check},t,j_1,\ldots,j_7}$, for
each $t \in \{1,\ldots,T-1\}$ and $j_1,\ldots,j_7 \in \{1,\ldots,T\}$. Then the execution tableau correctly encodes the
execution of $D$ on input $(n, x, y, a, b)$ when run for time $T$.


Proof. To show this, we will show for each time $t \in \{1,\ldots,T\}$ that the variables in the execution tableau
corresponding to time $t$ correctly encode the state of $D$ and the contents of the seven tapes at time $t$. The
proof is by induction on $t$. The base case of $t = 1$ follows from the tableau satisfying $\varphi_{\mathrm{Boundary}}$.

Next we perform the induction step. Assuming the statement holds for time $t \in \{1,\ldots,T-1\}$, we
will show it holds for time $t + 1$ as well. Let $i^* \in \{1,\ldots,7\}$ be a tape, and consider a tape position
$j_{i^*} \in \{1,\ldots,T\}$. We will show that the variables correctly encode the contents of this tape position at time
$t + 1$. Suppose one of the tape positions $j_{i^*} - 1$, $j_{i^*}$, or $j_{i^*} + 1$ at time $t$ has a tape head. For each of the other
tapes $i \neq i^*$, we select a tape position $j_i$ such that either $j_i - 1$, $j_i$, or $j_i + 1$ has a tape head at time $t$. (These
positions are guaranteed to exist since each tape has exactly one tape head.)

By the induction hypothesis, for each tape $i \in \{1,\ldots,7\}$ the variables corresponding to tape cells $j_i - 1$,
$j_i$, and $j_i + 1$ correctly encode the contents of these cells at time $t$. By assumption, $C_{\mathrm{Check},t,j_1,\ldots,j_7}$ evaluates
to 1. In this case, it calculates the transition function of $D$ to compute the contents of the tape cells $j_1,\ldots,j_7$
at time $t + 1$ and checks that the corresponding variables encode these contents. As a result, the tape position
$j_{i^*}$ on tape $i^*$ is correctly encoded. In addition, it computes the state of $D$ at time $t + 1$ and checks that the
corresponding variables encode this state. This completes the induction step. The case when none of the
tape positions $j_{i^*} - 1$, $j_{i^*}$, and $j_{i^*} + 1$ at time $t$ contain a tape head follows similarly. Finally, the variables
for all tape positions $j \in \{0, T+1\}$ are correctly encoded due to $\varphi_{\mathrm{Boundary}}$ being satisfied. □


Proposition 10.13 gives a set of constraints that ensure the execution tableau properly encodes the
execution of $D$. Our next step will be to convert these constraints into a single 3SAT formula. This entails
transforming each circuit $C_{\mathrm{Check},t,j_1,\ldots,j_7}$ into a 3SAT formula. We do so using the following reduction.


Proposition 10.14 (Circuit-to-3SAT). There is an algorithm which, on input a size-$r$ circuit $C$ on variables
$x \in \{0,1\}^n$, runs in time $\mathrm{poly}(r)$ and outputs a 3SAT formula $\varphi$ on variables $x \in \{0,1\}^n$ and $y \in \{0,1\}^r$
with $O(r)$ clauses such that for all $x$, $C(x) = 1$ if and only if there exists a $y$ such that $x$ and $y$ satisfy $\varphi$.




Proof. This is the textbook circuit-to-3SAT reduction. For each gate $i \in \{1,\ldots,r\}$, the algorithm
introduces a variable $y_i \in \{0,1\}$, so that the total collection of variables is $V_{\mathrm{total}} = \{x_i\}_{i \in \{1,\ldots,n\}} \cup \{y_i\}_{i \in \{1,\ldots,r\}}$.
Consider gate $i \in \{1,\ldots,r\}$, and let $z_1, z_2 \in V_{\mathrm{total}}$ be the pair of variables feeding into it. If gate $i$ is an
AND gate, then $\varphi$ includes the constraints

$(\neg z_1 \vee \neg z_2 \vee y_i)$, $(z_1 \vee z_2 \vee \neg y_i)$, $(z_1 \vee \neg z_2 \vee \neg y_i)$, $(\neg z_1 \vee z_2 \vee \neg y_i)$. (121)

These constraints are satisfied if and only if $y_i = z_1 \wedge z_2$. The case of gate $i$ being an OR gate follows
similarly. Finally, $\varphi$ includes the constraint $(y_{i^*})$, where $i^* \in \{1,\ldots,r\}$ is the output gate. It follows that
$x, y$ satisfy $\varphi$ if and only if $C(x) = 1$ and for each $i \in \{1,\ldots,r\}$, $y_i$ is the value computed by gate $i$ in
circuit $C$ on input $x$. In total, $\varphi$ has $4r + 1$ clauses and is computable in time $\mathrm{poly}(r)$. □
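
The gate-by-gate reduction in this proof can be sketched in Python as follows. The circuit format (a list of `(op, a, b)` triples over wire indices) and all function names are assumptions made for the illustration; the clauses for an AND gate coincide with those of Equation (121), and the OR case is handled the same way by forbidding the four inconsistent gate assignments.

```python
from itertools import product

def circuit_to_3sat(n, gates, out):
    """gates: list of (op, a, b) with op in {"AND", "OR"}.
    Wires 0..n-1 are inputs; wire n + g carries the output of gate g.
    Returns clauses as lists of (wire, polarity); a literal holds iff wire == polarity."""
    clauses = []
    for g, (op, a, b) in enumerate(gates):
        y = n + g
        for va, vb in product((0, 1), repeat=2):
            val = (va and vb) if op == "AND" else (va or vb)
            # This clause forbids exactly the assignment (a, b, y) = (va, vb, 1 - val).
            clauses.append([(a, 1 - va), (b, 1 - vb), (y, val)])
    clauses.append([(out, 1)])  # the output gate must evaluate to 1
    return clauses

def evaluate(n, gates, out, x):
    """Evaluate the circuit on input bits x and return the value on wire `out`."""
    w = list(x)
    for op, a, b in gates:
        w.append((w[a] and w[b]) if op == "AND" else (w[a] or w[b]))
    return w[out]

def sat(clauses, values):
    """Check a full assignment (inputs followed by gate variables)."""
    return all(any(values[v] == pol for v, pol in c) for c in clauses)
```

For a circuit with $r$ gates this produces $4r + 1$ clauses, and brute force over the gate variables confirms the "if and only if" on small examples.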


We now apply the algorithm from Proposition 10.14 to $C_{\mathrm{Check}}$, which by Proposition 10.12 has size $r \le
\mathrm{poly}(|D|)$. This produces a 3SAT formula $\varphi_{\mathrm{Check}}$ on the variables in $C_{\mathrm{Check}}$ plus auxiliary variables $a_k$ for
$k \in \{1,\ldots,r\}$ added by the reduction. Now, for each $t \in \{1,\ldots,T-1\}$ and $j_1,\ldots,j_7 \in \{1,\ldots,T\}$, we
define a 3SAT formula $\varphi_{\mathrm{Check},t,j_1,\ldots,j_7}$ analogously to $C_{\mathrm{Check},t,j_1,\ldots,j_7}$. We begin by associating those variables
in $\varphi_{\mathrm{Check}}$ which come from $C_{\mathrm{Check}}$'s inputs with the variables in $V$ as in Definition 10.11. Next, for each
$k \in \{1,\ldots,r\}$, we introduce a new variable $a_{t,j_1,\ldots,j_7,k} \in \{0,1\}$ and associate it with the variable $a_k$. This
defines $\varphi_{\mathrm{Check},t,j_1,\ldots,j_7}$. In summary, the final 3SAT instance produced by the Cook-Levin reduction is


$\varphi := \varphi_{\mathrm{Boundary}} \wedge \Big( \bigwedge_{t,j_1,\ldots,j_7} \varphi_{\mathrm{Check},t,j_1,\ldots,j_7} \Big)$. (122)


By Proposition 10.14, the execution tableau properly encodes the execution of $D$ if and only if there exists
a setting to the auxiliary variables satisfying the 3SAT formula $\varphi$. In total, $\varphi$ contains


$|V| + O(T^8) \cdot r \le O(T^2 + T \cdot \log(|D|) + T^8 \cdot \mathrm{poly}(|D|)) = \mathrm{poly}(T, |D|)$ (123)


variables. The second term on the left-hand side of Equation (123) corresponds to the auxiliary variables
in each of the $O(T^8)$ copies of $\varphi_{\mathrm{Check}}$. The inequality follows by Equation (112) and the fact that $r \le
\mathrm{poly}(|D|)$.

Our next task is to represent the 3SAT formula $\varphi$ succinctly. To do this, we will provide a circuit $C$
which succinctly describes a 3SAT formula $\varphi_{\mathrm{3SAT}}$ which, while not literally equal to $\varphi$, will be isomorphic
to it. This means that, for example, $\varphi_{\mathrm{3SAT}}$ may not even have the same number of variables as $\varphi$, but any
variable in $\varphi$ will correspond in a clear and direct manner to a variable in $\varphi_{\mathrm{3SAT}}$, and any remaining variables
in $\varphi_{\mathrm{3SAT}}$ do not appear in any clauses. This circuit is constructed as follows.


Definition 10.15. In this definition we construct the circuit $C$. It has three inputs $z_1, z_2, z_3$ of length $m$,
which we specify below in Equation (124), and three inputs $o_1, o_2, o_3 \in \{0,1\}$. For each $\nu \in \{1,2,3\}$,
$z_\nu$ is supposed to specify a variable in $\varphi$ according to a format we will now specify. If $z_\nu$ is not properly
formatted, then it does not correspond to a variable in $\varphi$; if any of $z_1, z_2, z_3$ is not properly formatted, then $C$
automatically outputs 0. Below, we will often write substrings of the $z_\nu$'s as though they are integers from
some specified range, i.e. $a \in \{b,\ldots,c\}$. This means that $a$ is represented as a binary string of length
$\lceil \log(c+1) \rceil$, which is to be interpreted as the binary encoding of an integer between $b$ and $c$.

The input $z_\nu$ is formatted as a string $(\omega, \alpha, \beta_1, \beta_2, \beta_3, \beta_4)$. The first substring $\omega$ has length $m - (|\alpha| +
|\beta_1| + \cdots + |\beta_4|)$ bits and is formatted to be the all-zeroes string. Its purpose is to pad the inputs to have
the length $m$ we specify below. Next, $\alpha$ is formatted as an integer $\alpha \in \{1,2,3,4\}$. The variable encoded
by $z_\nu$ is specified by $\beta_\alpha$, and the other three $\beta$'s should be the all-zeroes string. We now specify the encoding
of $\beta_\alpha$, conditioned on the value of $\alpha$.




1. $\beta_1$ is formatted as $(t, i, j, k)$, where
   $t \in \{1,\ldots,T\}$, $i \in \{1,\ldots,7\}$, $j \in \{0,1,\ldots,T+1\}$, $k \in \{1,2\}$.
   This corresponds to the variable $c_{t,i,j,k}$.
2. $\beta_2$ is formatted as $(t, i, j)$, where
   $t \in \{1,\ldots,T\}$, $i \in \{1,\ldots,7\}$, $j \in \{0,1,\ldots,T+1\}$.
   This corresponds to the variable $h_{t,i,j}$.
3. $\beta_3$ is formatted as $(t, k)$, where
   $t \in \{1,\ldots,T\}$, $k \in \{1,\ldots,\kappa\}$.
   This corresponds to the variable $s_{t,k}$.
4. $\beta_4$ is formatted as $(t, j_1,\ldots,j_7, k)$, where
   $t \in \{1,\ldots,T-1\}$, $j_1,\ldots,j_7 \in \{1,2,\ldots,T\}$, $k \in \{1,\ldots,r\}$.
   This corresponds to the variable $a_{t,j_1,\ldots,j_7,k}$.


In total, the length of these substrings is
$|\alpha| + |\beta_1| + \cdots + |\beta_4| = O(\log(T) + \log(\kappa) + \log(r)) = O(\log(T) + \log(|D|))$.
As a result, because $|D| \le \sigma$, the length of the "padding" $\omega$ can be chosen so that each $z_\nu$ has length
$m = O(\log(T) + \log(\sigma))$. (124)


Checking that each $z_\nu$ is properly formatted can be done by checking that certain substrings of $z_1$, $z_2$, and
$z_3$ encode integers which fall within specified ranges. This can be done using $O(m)$ gates.
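
The variable format and its well-formedness check can be illustrated with a short Python sketch operating on bit strings. Everything here is a hypothetical rendering of the format described above: the helper names, the choice of writing each field with $\lceil \log(\mathrm{hi}+1) \rceil$ bits, and the assumption $T \ge 2$ (so that no field has width zero) are not taken from the text.

```python
import math

def width(hi):
    """Bits used for an integer field whose maximum value is hi."""
    return math.ceil(math.log2(hi + 1))

def field_ranges(T, kappa, r):
    """Allowed (lo, hi) ranges for the fields of beta_1, ..., beta_4."""
    return {
        1: [(1, T), (1, 7), (0, T + 1), (1, 2)],       # c_{t,i,j,k}
        2: [(1, T), (1, 7), (0, T + 1)],               # h_{t,i,j}
        3: [(1, T), (1, kappa)],                       # s_{t,k}
        4: [(1, T - 1)] + [(1, T)] * 7 + [(1, r)],     # a_{t,j_1..j_7,k}
    }

def encode(alpha, values, T, kappa, r, m):
    """Build z = (omega, alpha, beta_1, ..., beta_4) as a bit string of length m."""
    ranges = field_ranges(T, kappa, r)
    bits = format(alpha, "03b")                        # alpha <= 4 needs width(4) = 3 bits
    for a in (1, 2, 3, 4):
        for idx, (_, hi) in enumerate(ranges[a]):
            v = values[idx] if a == alpha else 0       # unused betas are all-zero
            bits += format(v, "0{}b".format(width(hi)))
    assert len(bits) <= m
    return "0" * (m - len(bits)) + bits                # omega: all-zero padding

def decode(z, T, kappa, r):
    """Return (alpha, values) if z is properly formatted, else None."""
    ranges = field_ranges(T, kappa, r)
    body = 3 + sum(width(hi) for a in ranges for _, hi in ranges[a])
    if len(z) < body or "1" in z[: len(z) - body]:
        return None                                    # padding must be all zeroes
    pos = len(z) - body
    alpha = int(z[pos:pos + 3], 2)
    pos += 3
    if alpha not in ranges:
        return None
    result = None
    for a in (1, 2, 3, 4):
        vals = []
        for _, hi in ranges[a]:
            vals.append(int(z[pos:pos + width(hi)], 2))
            pos += width(hi)
        if a == alpha:
            if any(not lo <= v <= hi for v, (lo, hi) in zip(vals, ranges[a])):
                return None                            # field outside its range
            result = vals
        elif any(vals):
            return None                                # unused beta is not all-zero
    return alpha, result
```

Each check is a comparison of a short substring against fixed bounds, which is consistent with the $O(m)$ gate count stated above.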

Having specified the inputs, we can now specify the execution of the circuit, and we may assume that
$z_1, z_2, z_3$ are properly formatted. Implementing the clauses from $\varphi_{\mathrm{Boundary}}$ is simple; we specify how to
implement the clause from Equation (113).


• Suppose for input $z_1$, $\alpha = 2$. Then $\beta_2$ can be parsed as $(t, i, j)$. The circuit accepts if $t = j = 1$ and
  $o_1 = 1$, regardless of $z_2$ and $z_3$. This ensures that $\varphi_{\mathrm{3SAT}}$ includes $x_{z_1} \vee x_{z_2}^{o_2} \vee x_{z_3}^{o_3}$ for any $z_2, z_3, o_2, o_3$,
  which is equivalent to including the arity-one clause $x_{z_1}$.


This can be implemented with O(m) gates. Similar arguments can be used to implement the clauses from 
Equations (114)-(119) using O(m) gates apiece. 

Implementing the clauses from the formulas $\varphi_{\mathrm{Check},t,j_1,\ldots,j_7}$ is more challenging. From Equation (121),
we can see that any constraint in this formula always involves a variable of the form $a_{t,j_1,\ldots,j_7,k}$ for some
$k \in \{1,\ldots,r\}$.


1. First, $C$ checks if one of its inputs $z_1, z_2, z_3$ corresponds to such a variable. This can be done with $O(1)$
   gates simply by checking if for any of the $z_\nu$'s, $\alpha = 4$. If so, this specifies the values of $t, j_1,\ldots,j_7$.




2. Each variable in $\varphi_{\mathrm{Check},t,j_1,\ldots,j_7}$ is associated with a variable in $\varphi_{\mathrm{Check}}$. The circuit checks if all the $z_\nu$'s
   are contained in $\varphi_{\mathrm{Check},t,j_1,\ldots,j_7}$ and then it computes which variable in $\varphi_{\mathrm{Check}}$ they are associated with.
   We include below the example of checking whether $z_1$ is associated with the variable $c_{\mathrm{Check},1,1,0,0}$.


   • The circuit $C$ first computes $t + 1$, which takes $O(m)$ gates. It then looks at $z_1$ and checks
     if $\alpha = 1$. If so, then $\beta_1 = (t', i', j', k')$, and so it tests the equalities $t' = t + 1$, $i' = 1$, $j' = j_1$,
     and $k' = 0$, each of which takes $O(m)$ gates to test. If so, then $z_1$ is associated with $c_{\mathrm{Check},1,1,0,0}$.


   There are $\mathrm{poly}(|D|)$ variables in $\varphi_{\mathrm{Check}}$ and, for each input $z_\nu$, it takes $O(m)$ gates to determine
   whether $z_\nu$ is associated with this variable. As a result, because $|D| \le \sigma$, this takes $\mathrm{poly}(\sigma, \log(T))$
   gates to compute.


3. Whether $z_1$, $z_2$, and $z_3$ share a clause in $\varphi_{\mathrm{Check},t,j_1,\ldots,j_7}$ depends only on which variables in $\varphi_{\mathrm{Check}}$ they
   are associated with. As a result, after having computed these variables, the algorithm can hard-code
   whether the circuit $C$ should accept.


This completes the description of $C$. In total, it contains $\mathrm{poly}(\sigma, \log(T))$ gates. Computing $C$ first requires
computing $\varphi_{\mathrm{Check}}$, which takes time $\mathrm{poly}(|D|) \le \mathrm{poly}(\sigma)$ by Propositions 10.12 and 10.14. After that,
the steps outlined above for constructing $C$ take time $\mathrm{poly}(\sigma, \log(T))$.


The circuit $C$ succinctly describes $\varphi$ in the loose sense described above. Recall that $\varphi$ accepts if and
only if it encodes the execution of $D$ up to time $T$. We now modify $C$ to (i) hard code $n$, $x$, and $y$ onto $D$'s
input tapes, and (ii) ensure that $D$ accepts. We demonstrate how to do so by hard coding $n$ as an example.


• The input $n$ is described by some string $v$ of length $\ell = O(\log(n))$. We would like to hard-code the
  string $(v, \sqcup^{T-\ell})$ into the first tape of $D$. To do this, we first check if $z_1$ corresponds to the variable
  $c_{t,i,j,k}$ for some values of $t, i, j, k$. Then, we check if $t = 1$, the time where the inputs appear on the
  tapes, and $i = 1$, corresponding to the first tape. If so, we branch on whether $j \le \ell$. If it is, then if
  $k = 1$ the circuit accepts if $o_1 = 0$, and if $k = 2$ the circuit accepts if $o_1 = v_j$. Otherwise, if $j > \ell$,
  then if $k = 1$ the circuit accepts if $o_1 = 1$, and if $k = 2$ the circuit accepts if $o_1 = 0$. This can be done
  with $\mathrm{poly}(\log(n), \log(T))$ gates.
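
The branching rule in this bullet amounts to a small lookup. The Python sketch below returns the polarity $o_1$ for which the circuit accepts on the variable $c_{1,1,j,k}$; the function names are hypothetical, and the string `v` stands for the binary description of $n$.

```python
def required_polarity(v: str, j: int, k: int) -> int:
    """Polarity o_1 of the unit clause on c_{1,1,j,k} (first tape, time t = 1)."""
    if j <= len(v):
        # Cell j holds the bit v_j: first encoding bit 0, second bit v_j.
        return 0 if k == 1 else int(v[j - 1])
    # Cell j is blank: encoding 10.
    return 1 if k == 1 else 0

def hardcode_accepts(v: str, t: int, i: int, j: int, k: int, o1: int) -> bool:
    """Whether the added sub-circuit accepts for z_1 = c_{t,i,j,k} and polarity o1."""
    return t == 1 and i == 1 and o1 == required_polarity(v, j, k)
```

Reading off both bits for every cell $j \in \{1,\ldots,T\}$ recovers the tape contents $(v, \sqcup^{T-\ell})$ that the unit clauses enforce.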


This modifies $\varphi$ so that it only accepts if $n$ is written on its first tape at input. We can similarly hard-code $x$
and $y$ onto the second and third input tapes and hard-code the accepting state as the final state of $D$. In
total, this takes $\mathrm{poly}(\log(n), |x|, |y|, \log(T), \sigma)$ gates, which is $\mathrm{poly}(\log(n), Q, \log(T), \sigma)$ because $x$ and $y$
have length at most $Q$.

It remains to ensure that the variables corresponding to the 4th and 5th tapes at time $t = 1$ are the
lexicographically-first named variables in $\varphi$. However, this is simple and can be done using $\mathrm{poly}(\log(n), Q, \log(T), \sigma)$ gates.
This concludes the construction.


10.3 A succinct 5SAT description for deciders 


Proposition 10.9 allows us to convert any decider $D$ and inputs $n, x, y$ into a 3SAT formula $\varphi_{\mathrm{3SAT}}$, succinctly
described by a circuit $C$, which represents it. However, there are two undesirable properties of this
construction, which we describe below.


1. First, evaluating any clause $(w_{i_1}^{o_1} \vee w_{i_2}^{o_2} \vee w_{i_3}^{o_3})$ of $\varphi_{\mathrm{3SAT}}$ requires evaluating the same assignment $w$
   at three separate points. While this is fine when the assignment $w$ is provided in full to the verifier,
   it can be a problem when the verifier is only able to query the points in $w$ by interacting with a
   prover. In this case, the verifier might send the values $i_1, i_2, i_3$ to the prover, who responds with three
   bits $b_1, b_2, b_3 \in \{0,1\}$, purported to be the values $w_{i_1}, w_{i_2}, w_{i_3}$ for some assignment $w \in \{0,1\}^N$.
   As it will turn out, in the answer reduced protocol below, the verifier will actually be able to force
   the prover to reply using three different assignments. In other words, the prover will have three
   assignments $w_1, w_2, w_3 \in \{0,1\}^N$ such that, for any $i_1, i_2, i_3$ provided to it by the verifier, it will
   respond with $b_1 = w_{1,i_1}$, $b_2 = w_{2,i_2}$, $b_3 = w_{3,i_3}$. However, even given this there is no guarantee that
   the three assignments are the same assignments (i.e. that $w_1 = w_2 = w_3$). In the work of [NW19],
   this was accomplished by an additional subroutine called the intersecting lines test, which would
   enforce consistency between $w_1$, $w_2$, and $w_3$. In this work, on the other hand, we would like to relax the
   assumption that $w_1$, $w_2$, and $w_3$ must be the same. This will allow us to not use the intersecting
   lines test, simplifying the answer reduction protocol. (In fact, the answer reduced verifier will not be
   querying the assignments directly, but rather low-degree encodings of these assignments.)


2. Second, we are guaranteed that $w = (a, b, c)$ satisfies $\varphi_{\mathrm{3SAT}}$ if and only if $D$ accepts $(n, x, y, a, b)$.
   However, it is inconvenient that $a$ and $b$ are contained as substrings of $w$. To see why, recall from
   Section 9 that the oracularized verifier sometimes gives one prover a pair of questions $(x, y)$ and
   another prover just one of the questions, say $x$. The answer reduced verifier will sample its questions
   similarly; as for its answers, it might expect the first prover to respond with a string $w = (a, b, c)$ that
   satisfies $\varphi_{\mathrm{3SAT}}$ and the second prover to respond with a string $a'$ such that $a = a'$. Verifying that
   $a = a'$ requires the verifier to sample a uniformly random point from $w$, restricted to the coordinates
   in $a$. As it turns out, generating a uniform point from a substring is extremely cumbersome, though
   not impossible, to do when the verifier's questions are expected to be sampled from conditional linear
   functions. To remove this complication, we would instead prefer if the provers' answers were
   formatted in a way that gave the verifier direct access to the strings $a$ and $b$.


In the remainder of this section, we will show how to modify the Succinct-3SAT circuits produced by
Proposition 10.9 in order to ameliorate these two difficulties. Doing so entails modifying the $\varphi_{\mathrm{3SAT}}$ formula
from Proposition 10.9 to produce a 5SAT formula $\varphi_{\mathrm{5SAT}}$. The clauses of $\varphi_{\mathrm{5SAT}}$ will be of the form

$a_{i_1}^{o_1} \vee b_{i_2}^{o_2} \vee w_{1,i_3}^{o_3} \vee w_{2,i_4}^{o_4} \vee w_{3,i_5}^{o_5}$

where $a$, $b$, $w_1$, $w_2$, and $w_3$ are five separate assignments which are not assumed to be equal. The guarantee is
that $(a, b, w_1, w_2, w_3)$ satisfies $\varphi_{\mathrm{5SAT}}$ if and only if $D$ accepts $(n, x, y, a, b)$. This addresses the two concerns
from above: each clause is totally decoupled, meaning it samples 5 variables from 5 different assignments,
and so no consistency check must be performed between the assignments. In addition, the first two strings
exactly correspond to $a$ and $b$, addressing the second item.

We now formally define decoupled 5SAT instances and how they succinctly represent bounded deciders.
Following that, we show how the succinct 3SAT instance $\varphi_{\mathrm{3SAT}}$ produced by Proposition 10.9 can be modified
to produce a succinct 5SAT instance $\varphi_{\mathrm{5SAT}}$ which represents $D$.


Definition 10.16 (Decoupled 5SAT and its succinct descriptions). A block of variables $x_i$ is a tuple $x_i =
(x_{i,0},\ldots,x_{i,N_i-1})$. A formula $\varphi$ on 5 blocks $x_1, x_2, \ldots, x_5$ of variables is called a decoupled 5SAT formula
if every clause is of the form

$x_{1,i_1}^{o_1} \vee x_{2,i_2}^{o_2} \vee x_{3,i_3}^{o_3} \vee x_{4,i_4}^{o_4} \vee x_{5,i_5}^{o_5}$, (125)


for $i_j \in \{0,1,\ldots,N_j - 1\}$ and $o_1,\ldots,o_5 \in \{0,1\}$. (Recall from Definition 10.2 that the notation $x^{o}$ means
$x$ if $o = 1$ and $\neg x$ if $o = 0$.)




For each $i \in \{1,2,\ldots,5\}$, suppose each $N_i$ is a power of two, and write it as $N_i = 2^{n_i}$. Let $C$ be
a circuit with five inputs of length $n_1, n_2, \ldots, n_5$ and five single-bit inputs. Then $C$ succinctly describes
decoupled $\varphi$ if, for all $i_j \in \{0,1,\ldots,N_j - 1\}$ and $o_1, o_2, \ldots, o_5 \in \{0,1\}$,


$C(i_1, i_2, \ldots, i_5, o_1, o_2, \ldots, o_5) = 1$ (126)


if and only if the clause in (125) is included in $\varphi$. As in Definition 10.2, we slightly abuse notation and
use the convention that a number $a$ between 0 and $2^{n_i} - 1$ is interpreted as its binary encoding $\mathrm{binary}_{n_i}(a)$
when provided as input to a set of $n_i$ single-bit wires.
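
To illustrate Definition 10.16, the following Python sketch treats the circuit $C$ as a plain predicate on integer indices and polarity bits, enumerates the clauses it describes, and checks whether five blocks satisfy all of them. The helper names and the toy predicate are assumptions made for the example; a real succinct description would never be unpacked by brute force like this.

```python
from itertools import product

def clauses_of(C, sizes):
    """Enumerate the clauses (indices, polarities) of the formula described by C."""
    for idx in product(*(range(N) for N in sizes)):
        for pol in product((0, 1), repeat=5):
            if C(*idx, *pol) == 1:
                yield idx, pol

def satisfies(blocks, C):
    """Literal x^o holds iff x == o; each clause reads one variable from each block."""
    sizes = [len(b) for b in blocks]
    return all(
        any(blocks[v][idx[v]] == pol[v] for v in range(5))
        for idx, pol in clauses_of(C, sizes)
    )

def equal_first_two(i1, i2, i3, i4, i5, o1, o2, o3, o4, o5):
    """Toy description: forces block 1 to equal block 2 coordinate by coordinate.
    The last three literals range over all indices and polarities, so they
    cannot be relied on to satisfy the clause."""
    return int(i1 == i2 and o1 != o2)
```

The toy predicate shows how a constraint on fewer than five variables is expressed in decoupled form: the clause is repeated for every choice of the unused indices and polarities.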


Definition 10.17 (Succinct descriptions for bounded deciders). Let $D$ be a decider. Fix an index $n \in \mathbb{N}$ and
a time $T \in \mathbb{N}$. Let $L = 2^{\ell}$ be a power of two that is at least as large as $2T$. Let $x$ and $y$ be strings, $r \in \mathbb{N}$
and $R = 2^{r}$.

Consider a circuit $C$ with two inputs of length $\ell$, three inputs of length $r$, and 5 single-bit inputs. Let $\varphi_C$
be the decoupled 5SAT instance with two blocks of variables of size $L$ and three blocks of size $R$ which $C$
succinctly describes. Then we say that $C$ succinctly describes $D$ (on inputs $n$, $x$, and $y$ and time $T$) if, for
all $a, b \in \{0,1\}^{L}$, there exist $w_1, w_2, w_3 \in \{0,1\}^{R}$ such that $a, b, w_1, w_2, w_3$ satisfy $\varphi_C$ if and only if there
exist $a_{\mathrm{prefix}}, b_{\mathrm{prefix}} \in \{0,1\}^*$ of lengths $\ell_a, \ell_b \le T$, respectively, such that


$a = \mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{L/2 - \ell_a})$ and $b = \mathrm{enc}_\Gamma(b_{\mathrm{prefix}}, \sqcup^{L/2 - \ell_b})$


and $D$ accepts $(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$ in time $T$.


In this definition of succinct descriptions, the answers $a$ and $b$ are isolated, in that the first input of $C$ of
length $\ell$ indexes into $a$ and the second input of length $\ell$ indexes into $b$. The next proposition shows how to
construct such descriptions.


Proposition 10.18 (Explicit succinct descriptions). There is a Turing machine SuccinctDecider with the
following properties. Let $D$ be a decider, let $n$, $T$, $Q$, and $\sigma$ be integers with $Q \le T$ and $|D| \le \sigma$, and
let $x$ and $y$ be strings of length at most $Q$. Then on input $(D, n, T, Q, \sigma, x, y)$, SuccinctDecider outputs a
circuit $C$ with two inputs of length $\ell_0(T)$, three of length $r_0(T, \sigma)$, and five single-bit inputs which succinctly
describes $D$ on inputs $n$, $x$, and $y$ and time $T$. Moreover, the following hold.


1. $\ell_0(T) = \lceil \log(2T) \rceil$,
2. $r_0(T, \sigma) = O(\log(T) + \log(\sigma))$,
3. $C$ has at most $s_0(n, T, Q, \sigma) = \mathrm{poly}(\log(T), \log(n), Q, \sigma)$ gates,


4. SuccinctDecider runs in time $\mathrm{poly}(\log(T), \log(n), Q, \sigma)$, and the parameters $\ell_0, r_0, s_0$ can be computed
   from $n, T, Q, \sigma$ in time $\mathrm{poly}(\log(T), \log(n), \log(Q), \sigma)$.


Proof. The Turing machine SuccinctDecider begins by running the algorithm in Proposition 10.9 on input
$(D, n, T, Q, \sigma, x, y)$ to produce a circuit $C_{\mathrm{3SAT}}$ on $3r_0 + 3$ inputs, where $r_0 = O(\log(T) + \log(\sigma))$ is the
parameter $m$ from the proposition. Set $\ell_0 = \lceil \log(2T) \rceil$, $L = 2^{\ell_0}$ and $R = 2^{r_0}$. Given this, SuccinctDecider
returns the circuit $C$ with inputs $i_1, i_2 \in \{0,1,\ldots,L-1\}$, $i_3, i_4, i_5 \in \{0,1,\ldots,R-1\}$, $o_1, o_2, \ldots, o_5 \in
\{0,1\}$, and

$C(i_1, i_2, \ldots, i_5, o_1, o_2, \ldots, o_5) = 1$,




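if one of the following conditions holds.

$C_{\mathrm{3SAT}}(i_3, i_4, i_5, o_3, o_4, o_5) = 1$,
$(i_1 < 2T) \wedge (i_1 = i_3) \wedge (o_1 \neq o_3)$,
$(i_2 < 2T) \wedge (i_2 = i_3 - 2T) \wedge (o_2 \neq o_3)$,
$(i_1 \ge 2T) \wedge (i_1 \text{ is odd}) \wedge (o_1 = 1)$,
$(i_1 \ge 2T) \wedge (i_1 \text{ is even}) \wedge (o_1 = 0)$,
$(i_2 \ge 2T) \wedge (i_2 \text{ is odd}) \wedge (o_2 = 1)$,
$(i_2 \ge 2T) \wedge (i_2 \text{ is even}) \wedge (o_2 = 0)$,
$(i_3 = i_4) \wedge (o_3 \neq o_4)$,
$(i_4 = i_5) \wedge (o_4 \neq o_5)$.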


It is not hard to verify that testing "$(i_1 < 2T)$" can be done with $O(\ell_0)$ AND and OR gates, and testing
"$(i_2 = i_3 - 2T)$" can be done with $O(m)$ AND and OR gates. Using similar estimates for the remaining
sub-circuits, we compute


$\mathrm{size}(C) = \mathrm{size}(C_{\mathrm{3SAT}}) + O(r_0 + \ell_0) = \mathrm{size}(C_{\mathrm{3SAT}}) + O(r_0)$
$\le \mathrm{poly}(\log(T), \log(n), Q, \sigma) + O(\log(T) + \log(\sigma))$.


In addition, due to the simplicity of these modifications, we conclude that the runtime of SuccinctDecider
is dominated by the runtime of the algorithm from Proposition 10.9, which is $\mathrm{poly}(\log(T), \log(n), Q, \sigma)$.

Now we show that $C$ succinctly describes $D$ on inputs $n$, $x$, and $y$ and time $T$. To begin, we describe
the decoupled 5SAT formula $\varphi_C$. Let us first consider the constraints in $\varphi_C$ which are implied by the final
constraint, i.e. those of the form


$a_{i_1}^{o_1} \vee b_{i_2}^{o_2} \vee (w_1)_{i_3}^{o_3} \vee (w_2)_{i_4}^{o_4} \vee (w_3)_{i_5}^{o_5}$


whenever $i_4 = i_5$ and $o_4 \neq o_5$. For any fixed $i_1, i_2, i_3$, the negations $o_1, o_2, o_3$ can take any values, and
as a result, the first three bits in the constraint vary over all assignments in $\{0,1\}^3$. This means that these
constraints are satisfied if and only if $(w_2)_{i_4}^{o_4} \vee (w_3)_{i_5}^{o_5}$ is satisfied whenever $i_4 = i_5$ and $o_4 \neq o_5$. This, in
turn, is equivalent to the constraint $w_2 = w_3$. Carrying out similar arguments for the entire circuit, we can
express the formula $\varphi_C$ as follows.


$\varphi_C(a, b, w_1, w_2, w_3) = \varphi_{\mathrm{3SAT}}(w_1, w_2, w_3) \wedge (w_{1,1} = a_1) \wedge (w_{1,2} = b_1)$
$\wedge\, (a_2 = (10)^{L/2 - T}) \wedge (b_2 = (10)^{L/2 - T}) \wedge (w_1 = w_2) \wedge (w_2 = w_3)$.


Here, we write $\varphi_{\mathrm{3SAT}}(w_1, w_2, w_3)$ for the formula in which, for each constraint in $\varphi_{\mathrm{3SAT}}$, the first variable
is taken from $w_1$, the second from $w_2$, and the third from $w_3$. In addition, we write $a = (a_1, a_2)$, where
$a_1$ is the first $2T$ bits in $a$ and $a_2$ is the remaining $L - 2T$ bits, and similarly for $b = (b_1, b_2)$. We also
write $w_1 = (w_{1,1}, w_{1,2}, w_{1,3})$, where $w_{1,1}$ contains the first $2T$ bits in $w_1$, $w_{1,2}$ contains the second $2T$
bits, and $w_{1,3}$ contains the remaining $R - 4T$ bits. As a result, $\varphi_C$ is satisfied only if $w_1 = w_2 = w_3 =
(a_1, b_1, c)$ for some string $c \in \{0,1\}^{R - 4T}$. In this case, calling $w = (a_1, b_1, c)$, $\varphi_C$ is satisfied only if
$\varphi_{\mathrm{3SAT}}(w)$ is. By Proposition 10.9, this implies that there exists a string $a_{\mathrm{prefix}}$ of length $\ell_a \le T$ such that
$a_1 = \mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{T - \ell_a})$. This, in turn, implies that


$a = (a_1, a_2) = (\mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{T - \ell_a}), (10)^{L/2 - T}) = \mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{L/2 - \ell_a})$,


using the fact that $\mathrm{enc}_\Gamma(\sqcup) = 10$, and similarly for $b$. Finally, Proposition 10.9 implies that $D$ accepts
$(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$ in time $T$. This completes the proof. □
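
The wrapper built in this proof can be exercised on a toy instance. The Python sketch below transcribes the nine acceptance conditions into a predicate and checks, by brute force, that the described formula forces $w_1 = w_2 = w_3$ and ties the first two blocks to the first $4T$ coordinates of $w_1$. The names `decouple` and `satisfies`, the toy 3SAT description, and the parameter choice $T = 1$, $L = 2T$, $R = 4$ (so that the padding conditions on indices at least $2T$ never fire) are all assumptions for the illustration.

```python
from itertools import product

def decouple(c3sat, T):
    """Wrap a predicate describing a 3SAT formula into the decoupled 5SAT predicate."""
    def C(i1, i2, i3, i4, i5, o1, o2, o3, o4, o5):
        return int(
            c3sat(i3, i4, i5, o3, o4, o5) == 1
            or (i1 < 2 * T and i1 == i3 and o1 != o3)          # a_1 equals w_{1,1}
            or (i2 < 2 * T and i2 == i3 - 2 * T and o2 != o3)  # b_1 equals w_{1,2}
            or (i1 >= 2 * T and i1 % 2 == 1 and o1 == 1)       # padding of a
            or (i1 >= 2 * T and i1 % 2 == 0 and o1 == 0)
            or (i2 >= 2 * T and i2 % 2 == 1 and o2 == 1)       # padding of b
            or (i2 >= 2 * T and i2 % 2 == 0 and o2 == 0)
            or (i3 == i4 and o3 != o4)                         # w_1 equals w_2
            or (i4 == i5 and o4 != o5)                         # w_2 equals w_3
        )
    return C

def satisfies(blocks, C):
    """Check every clause described by C; literal x^o holds iff x == o."""
    sizes = [len(b) for b in blocks]
    for idx in product(*(range(N) for N in sizes)):
        for pol in product((0, 1), repeat=5):
            if C(*idx, *pol) and not any(blocks[v][idx[v]] == pol[v] for v in range(5)):
                return False
    return True

def toy_c3sat(i3, i4, i5, o3, o4, o5):
    """Hypothetical 3SAT description: the single arity-one clause 'variable 0 is true'."""
    return int(i3 == 0 and o3 == 1)

C = decouple(toy_c3sat, T=1)
```

With these parameters, an assignment satisfies the formula exactly when the three $w$ blocks coincide, their first two coordinates equal $a$, their next two equal $b$, and the toy clause holds.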


As stated above, moving from 3SAT to 5SAT allows us to devote the first two inputs to $a$ and $b$. In
addition, we have added extra constraints into $\varphi_C$ which enforce that $w_1 = w_2 = w_3$, which means that we
can relax this assumption on these assignments.

We now show a simple transformation that takes in a succinct circuit $C$ and outputs another succinct
circuit $C'$ whose 5 inputs are "padded" to contain more input bits.


Proposition 10.19 (Padding). Let $C$ be a circuit of size $s$ with two inputs of length $\ell$, three inputs of length $r$,
and 5 single-bit inputs. Suppose $C$ succinctly describes $D$ on inputs $n$, $x$, and $y$ and time $T$. Then there is an
algorithm which takes as input $(C, \ell, r, \ell', r')$, with $\ell' \ge \ell$ and $r' \ge r$, and in time $\mathrm{poly}(s, \ell', r')$ outputs a
circuit $C'$ with the following properties. First, $C'$ has two inputs of length $\ell'$, three inputs of length $r'$, and 5
single-bit inputs, and its size is $s + \mathrm{poly}(\ell', r')$. Second, it succinctly describes $D$ on inputs $n$, $x$, and $y$ and
time $T$.


Proof. Write $L = 2^{\ell}$, $R = 2^{r}$ and $L' = 2^{\ell'}$, $R' = 2^{r'}$. The algorithm constructs the circuit $C'$ which on
inputs $i_1, i_2 \in \{0,\ldots,L'-1\}$, $i_3, i_4, i_5 \in \{0,\ldots,R'-1\}$, and $o_1,\ldots,o_5 \in \{0,1\}$, outputs 1 if and only if
one of the following conditions holds:

$(i_1, i_2 < L) \wedge (i_3, i_4, i_5 < R) \wedge C(i_1, i_2, i_3, i_4, i_5, o_1, o_2, o_3, o_4, o_5) = 1$,
$(i_1 \ge L) \wedge (i_1 \text{ is odd}) \wedge (o_1 = 1)$,
$(i_1 \ge L) \wedge (i_1 \text{ is even}) \wedge (o_1 = 0)$,
$(i_2 \ge L) \wedge (i_2 \text{ is odd}) \wedge (o_2 = 1)$,
$(i_2 \ge L) \wedge (i_2 \text{ is even}) \wedge (o_2 = 0)$.

Now we show that $C'$ succinctly describes $D$ on inputs $n$, $x$, and $y$ and time $T$. To begin, we describe the
decoupled 5SAT formula $\varphi_{C'}$. The second and third constraints imply that for each $i_1 \ge L$, $\varphi_{C'}$ contains the
constraint $(a_{i_1})$ if $i_1$ is odd and $(\neg a_{i_1})$ if $i_1$ is even. Likewise, the fourth and fifth constraints imply that for
each $i_2 \ge L$, $\varphi_{C'}$ contains the constraint $(b_{i_2})$ if $i_2$ is odd and $(\neg b_{i_2})$ if $i_2$ is even. Thus, we can express the
formula $\varphi_{C'}$ as follows.


$\varphi_{C'}(a, b, w_1, w_2, w_3) = \varphi_C(a_1, b_1, w_{1,1}, w_{2,1}, w_{3,1}) \wedge (a_2 = (10)^{(L'-L)/2}) \wedge (b_2 = (10)^{(L'-L)/2})$. (127)

Here, we write $a = (a_1, a_2)$ and $b = (b_1, b_2)$, where $a_1, b_1$ have length $L$ and $a_2, b_2$ have length $L' - L$, and
for each $i \in \{1,2,3\}$, we write $w_i = (w_{i,1}, w_{i,2})$, where $w_{i,1}$ has length $R$ and $w_{i,2}$ has length $R' - R$.

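Now, suppose there exist $w_1, w_2, w_3$ such that $a, b, w_1, w_2, w_3$ satisfy $\varphi_{C'}$. Then because $C$ succinctly
represents $D$, there exist $a_{\mathrm{prefix}}, b_{\mathrm{prefix}}$ of lengths $\ell_a, \ell_b \le T$ such that $D$ accepts $(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$. In
addition,

$a_1 = \mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{L/2 - \ell_a})$

and likewise for $b_1$. Equation (127) then implies that

$a = (a_1, a_2) = (\mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{L/2 - \ell_a}), (10)^{(L'-L)/2})$
$= (\mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{L/2 - \ell_a}), \mathrm{enc}_\Gamma(\sqcup)^{(L'-L)/2})$
$= \mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{L'/2 - \ell_a})$,

where the third step used the fact that $\mathrm{enc}_\Gamma(\sqcup) = 10$. As a similar statement holds for $b$, this establishes
that $C'$ succinctly describes $D$ on inputs $n$, $x$, and $y$ and time $T$. □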




By combining Proposition 10.18 and Proposition 10.19 (choosing $\ell' = r' = r_0$ where $r_0$ is from
Proposition 10.18), we get the following:


Proposition 10.20 (Explicit padded succinct descriptions). There is a Turing machine PaddedSuccinctDecider
with the following properties. Let $D$ be a decider, let $n$, $T$, $Q$, and $\sigma$ be integers with $Q \le T$ and $|D| \le \sigma$,
and let $x$ and $y$ be strings of length at most $Q$. Then on input $(D, n, T, Q, \sigma, x, y)$, PaddedSuccinctDecider
outputs a circuit $C$ with five inputs of length $m(T, \sigma)$ and five single-bit inputs which succinctly describes $D$
on inputs $n$, $x$, and $y$ and time $T$. Moreover, the following hold.


1. $m(T, \sigma) = O(\log(T) + \log(\sigma))$ and $2^{m} \ge 2T$.


2. $C$ has at most $s(n, T, Q, \sigma)$ gates, where $s$ is such that $s(n, T, Q, \sigma) = \mathrm{poly}(\log(T), \log(n), Q, \sigma)$
   and $5m(T, \sigma) + 5 + s(n, T, Q, \sigma)$ is a power of 2.


3. PaddedSuccinctDecider runs in time $\mathrm{poly}(\log(T), \log(n), Q, \sigma)$, and the parameters $m(T, \sigma)$, $s(n, T, Q, \sigma)$
   can be computed from $n, T, Q, \sigma$ in time $\mathrm{poly}(\log(T), \log(n), \log(Q), \sigma)$.


10.4 A PCP for normal form deciders 


We give a probabilistically checkable proof (PCP) for the Bounded Halting problem specialized to the case 
of normal form deciders. Our PCP will use standard techniques from the algebraic, low-degree-code-based 
PCP literature. In particular, we slightly modify the PCP for Succinct-3SAT described in [NW19, Section 
11] (which itself is based on the proof of the PCP theorem in [Har04]) to apply it to the decoupled Succinct-5SAT
instances described in Section 10.3. We follow their treatment closely. As the PCPs constructed in
this section are only an intermediate object towards the normal form verifier introduced in the next section,
we do not include standard definitions on PCPs, and refer to these references (in particular [Har04]) for
background. We begin with some preliminaries. 


10.4.1 Preliminaries 


A key part of the PCP will be to design a function $f : \mathbb{F}^n \to \mathbb{F}$ which is zero on the subcube $H_{\mathrm{subcube}} = \{0,1\}^n$. Our next proposition shows that given such a function $f$, there is a way of writing it so that the fact that it is zero on $H_{\mathrm{subcube}}$ is self-evidently true. Doing so involves showing that $f$ can be written in a simple basis of polynomials which are constructed to be zero on the subcube. This fact is standard in the literature (see, for example, [BSS08, Lemma 4.11]), and we include its proof for completeness.


Proposition 10.21 (Polynomial basis of zero functions). Let $\mathbb{F}$ be a field. Define $\mathrm{zero} : \mathbb{F} \to \mathbb{F}$ as the univariate polynomial $x \mapsto x(1 - x)$. Define $H_{\mathrm{subcube}} = \{0,1\}^n$. Suppose $f : \mathbb{F}^n \to \mathbb{F}$ is an individual degree-$d$ polynomial such that $f(x) = 0$ for all $x \in H_{\mathrm{subcube}}$. Then there exist polynomials $c_1, \ldots, c_n : \mathbb{F}^n \to \mathbb{F}$ such that for all $x \in \mathbb{F}^n$,
$$f(x) = \sum_{i=1}^{n} c_i(x) \cdot \mathrm{zero}(x_i).$$
In addition, for each $i \in \{1, \ldots, n\}$, $c_i$ has individual degree $d$.
Proof. To prove this, we first prove the following statement for each $k \in \{0, 1, \ldots, n\}$: there exists an individual degree-$d$ polynomial $r_k : \mathbb{F}^n \to \mathbb{F}$ and polynomials $c_1, \ldots, c_k : \mathbb{F}^n \to \mathbb{F}$ such that
$$f(x) = \sum_{i=1}^{k} c_i(x) \cdot \mathrm{zero}(x_i) + r_k(x). \qquad (128)$$




In addition, for each $i \in \{1, \ldots, k\}$, $c_i$ has individual degree $d$ and the degree of $x_i$ in $r_k$ is at most 1.

The proof is by induction on $k$, the base case of $k = 0$ being trivial. Now, we perform the induction step. Assuming that Equation (128) holds for $k$, we will show that it holds for $k + 1$ as well. Let $r_k$ be the polynomial guaranteed by the inductive hypothesis. We now divide $r_k$ by $\mathrm{zero}(x_{k+1})$ using polynomial division. This guarantees a polynomial $c_{k+1}$ and an individual degree-$d$ polynomial $r_{k+1}(x)$ such that
$$r_k(x) = c_{k+1}(x) \cdot \mathrm{zero}(x_{k+1}) + r_{k+1}(x).$$
In addition, $c_{k+1}$ still has individual degree $d$ (and in fact the degree of $x_{k+1}$ in $c_{k+1}$ is at most $d - 2$). Furthermore, for each $i \in \{1, \ldots, k+1\}$, $r_{k+1}$ has degree at most 1 in $x_i$. Plugging this into Equation (128), we see that
$$f(x) = \sum_{i=1}^{k+1} c_i(x) \cdot \mathrm{zero}(x_i) + r_{k+1}(x). \qquad (129)$$
This completes the induction.

Applying the $k = n$ case of Equation (128), we see that
$$f(x) = \sum_{i=1}^{n} c_i(x) \cdot \mathrm{zero}(x_i) + r(x), \qquad (130)$$
where $r = r_n$ and the degree of each variable in $r$ is at most 1. Since $f(x)$ vanishes on the subcube $H_{\mathrm{subcube}}$, and each $\mathrm{zero}(x_i)$ vanishes there, it must be that $r(x)$ vanishes on $H_{\mathrm{subcube}}$ as well. We claim that $r$ must therefore be the zero polynomial.

We prove this by showing the following statement: for every polynomial $s(x_1, \ldots, x_n)$ with individual degree 1 (i.e. a multilinear polynomial) that vanishes on all points $x \in \{0,1\}^n$, $s$ must be the zero polynomial. We show this by induction on the number of variables.

Consider the base case $n = 1$. Then $s$ is a univariate polynomial with degree at most 1. However, it is zero on two points, so it must be the zero polynomial. Now for the inductive step: assuming the statement holds for some $n \ge 1$, we show that it holds for $n + 1$ as well. For any multilinear polynomial $s$, we can write
$$s(x_1, \ldots, x_{n+1}) = x_{n+1} \, s_1(x_1, \ldots, x_n) + s_2(x_1, \ldots, x_n),$$
where both $s_1$ and $s_2$ are $n$-variate multilinear polynomials. Fix $x_{n+1} = 0$. Then $s(x_1, \ldots, x_n, 0) = s_2(x_1, \ldots, x_n)$. Since $s$ vanishes on $\{0,1\}^{n+1}$, this implies that $s_2$ vanishes on $\{0,1\}^n$ as well. By the inductive hypothesis, $s_2$ is the identically zero polynomial. Now fix $x_{n+1} = 1$. Then $s(x_1, \ldots, x_n, 1) = s_1(x_1, \ldots, x_n)$. Again, since $s$ vanishes on $\{0,1\}^{n+1}$, this implies that $s_1$ vanishes on $\{0,1\}^n$ as well. Again by the inductive hypothesis, $s_1$ is the identically zero polynomial. This shows that $s$ is the identically zero polynomial, completing the induction.

Thus, $r$ is the zero polynomial. Applying this fact to Equation (130), we arrive at the statement in the proposition. □
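The division procedure used in this proof is easy to carry out explicitly. The following sketch is only an illustration under stated assumptions: it works over the small prime field with 101 elements (an arbitrary stand-in for the generic field), represents a polynomial as a dictionary from exponent tuples to coefficients, and uses helper names of our own choosing. It repeatedly applies the identity $x^e = x^{e-1} - x^{e-2} \cdot \mathrm{zero}(x)$ to divide by $\mathrm{zero}(x_i)$ one variable at a time.

```python
P = 101  # assumption: a small prime field stands in for the generic field F


def add_term(poly, mono, coef):
    """Add coef * mono to poly (dict: exponent tuple -> coefficient mod P)."""
    c = (poly.get(mono, 0) + coef) % P
    if c:
        poly[mono] = c
    else:
        poly.pop(mono, None)


def divide_by_zero_poly(poly, i):
    """Divide poly by zero(x_i) = x_i (1 - x_i); return (quotient, remainder)."""
    quotient, remainder, work = {}, {}, dict(poly)
    while work:
        mono, coef = work.popitem()
        e = mono[i]
        if e < 2:
            add_term(remainder, mono, coef)
            continue
        # x_i^e = x_i^(e-1) - x_i^(e-2) * zero(x_i)
        add_term(quotient, mono[:i] + (e - 2,) + mono[i + 1:], -coef)
        add_term(work, mono[:i] + (e - 1,) + mono[i + 1:], coef)
    return quotient, remainder


def decompose(f, n):
    """Return ([c_1, ..., c_n], r) with f = sum_i c_i * zero(x_i) + r."""
    cs, r = [], dict(f)
    for i in range(n):
        c, r = divide_by_zero_poly(r, i)
        cs.append(c)
    return cs, r


def evaluate(poly, x):
    """Evaluate poly at the point x (a tuple of field elements)."""
    total = 0
    for mono, coef in poly.items():
        term = coef
        for xi, e in zip(x, mono):
            term = term * pow(xi, e, P) % P
        total = (total + term) % P
    return total


def zero(t):
    return t * (1 - t) % P
```

For example, $f(x_1, x_2) = x_1^2 x_2 - x_1 x_2^2$ vanishes on $\{0,1\}^2$, and `decompose` returns $c_1 = -x_2$, $c_2 = x_1$ with an empty remainder, whereas $x_1^2 + 1$ (which does not vanish on the subcube) leaves the nonzero multilinear remainder $x_1 + 1$.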


10.4.2 The PCP 


The problem. The input to the PCP verifier is a tuple $(D, n, T, Q, \sigma, \gamma, x, y)$. Here, $D$ is a decider, $n$, $T$, $Q$, $\sigma$ and $\gamma$ are integers with $Q \le T$ and $|D| \le \sigma$, and $x$ and $y$ are a pair of strings of length at most $Q$ each. The goal of the verifier is to check whether there exist two strings $a_{\mathrm{prefix}}$ and $b_{\mathrm{prefix}}$ of length at most $T$ such that $D$ halts on input $(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$ in time at most $T$. To do that the verifier makes random queries to a specially encoded PCP proof $\Pi$, and decides whether to accept or reject based on the parts of $\Pi$ that it reads. The proof $\Pi$ is encoded using a low-degree code whose parameters are controlled by $\gamma$: the larger $\gamma$, the smaller the soundness error in testing the code. We first set the parameters used in the PCP construction.


Definition 10.22 (Parameters for the PCP). For all integers $n, T, Q, \sigma, \gamma \in \mathbb{N}$ such that $Q \le T$ and $|D| \le \sigma$ define the tuple $\mathrm{pcpparams}(n, T, Q, \sigma, \gamma) = (q, m, d, m', s)$ as follows. Let $m = m(T, \sigma)$ and $s = s(n, T, Q, \sigma)$ be as in Proposition 10.20. Let $a' > 1$ and $0 < b' < 1$ be the universal constants from Theorem 7.8. Define the following integers.


1. Let $m' = 5m + 5 + s$ (recall that by Proposition 10.20, $s$ is chosen so that $m'$ is a power of 2).


2. Let $q = 2^k$ where $k$ is the smallest odd integer satisfying the following:


(a) $k \ge ((\gamma + 3a')/b') \log s$.
(b) $(2 + 5k) m' / 2^k < 1/2$.
(c) $k m' / 2^k \le s^{-\gamma}$.
(d) $2^k$ is divisible by $m'$.


3. Let $d = k$.


Given $n, T, Q, \sigma, \gamma$ represented in binary, the parameter tuple $\mathrm{pcpparams}(n, T, Q, \sigma, \gamma)$ can be computed in time $\mathrm{poly}(\log(n), \log(T), \log(Q), \log(\sigma), \log(\gamma))$.
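To make the choice of $k$ concrete, here is a small search routine. It is a sketch under explicit assumptions: the four conditions are implemented as we read them from Definition 10.22 (with the logarithm taken base 2), the function name is ours, and the values passed for the universal constants $a'$ and $b'$ of Theorem 7.8 are placeholders, since those constants are not specified in this section.

```python
import math


def smallest_k(m_prime, s, gamma, a_prime, b_prime):
    """Smallest odd k meeting conditions (a)-(d), as read from Definition 10.22.

    a_prime and b_prime are placeholders for the universal constants of
    Theorem 7.8; their true values are not given in this section.
    """
    threshold = (gamma + 3 * a_prime) / b_prime * math.log2(s)
    k = 1
    while True:
        q = 2 ** k
        ok_a = k >= threshold                      # (a)
        ok_b = 2 * (2 + 5 * k) * m_prime < q       # (b): (2 + 5k) m' / 2^k < 1/2
        ok_c = k * m_prime * s ** gamma <= q       # (c): k m' / 2^k <= s^(-gamma)
        ok_d = q % m_prime == 0                    # (d): m' divides 2^k
        if ok_a and ok_b and ok_c and ok_d:
            return k
        k += 2
```

With the illustrative inputs $m' = 64$, $s = 32$, $\gamma = 2$, $a' = 2$, $b' = 1/2$, condition (a) reads $k \ge 80$ and dominates the others, so the routine returns $k = 81$ and hence $q = 2^{81}$, $d = 81$.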
Next, we define the format of a valid PCP proof, which for our construction consists of evaluation tables 


of low-degree polynomials. 


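Definition 10.23. Given $n, T, Q, \sigma, \gamma \in \mathbb{N}$ and $(q, m, d, m', s) = \mathrm{pcpparams}(n, T, Q, \sigma, \gamma)$, a low-degree PCP proof is a tuple $\Pi$ of evaluation tables of polynomials $g_1, \ldots, g_5 : \mathbb{F}_q^m \to \mathbb{F}_q$ and $c_0, \ldots, c_{m'} : \mathbb{F}_q^{m'} \to \mathbb{F}_q$, with all polynomials having individual degree at most $d$. We divide the $m'$ input variables of $c_0, \ldots, c_{m'}$ into blocks as follows:
$$\mathbb{F}_q^{m'} \ni z = (\underbrace{x_1}_{\mathbb{F}_q^m}, \ldots, \underbrace{x_5}_{\mathbb{F}_q^m}, \underbrace{o}_{\mathbb{F}_q^5}, \underbrace{w}_{\mathbb{F}_q^s}).$$

Definition 10.24. Given a low-degree PCP proof $\Pi$ and a point $z = (x_1, \ldots, x_5, o, w) \in \mathbb{F}_q^{m'}$, where $x_1, \ldots, x_5 \in \mathbb{F}_q^m$, $o \in \mathbb{F}_q^5$ and $w \in \mathbb{F}_q^s$, the evaluation of $\Pi$ at $z$ is given by
$$\mathrm{eval}_z(\Pi) = (\alpha_1, \ldots, \alpha_5, \beta_0, \ldots, \beta_{m'}) \in \mathbb{F}_q^{m'+6},$$
where $\alpha_i = g_i(x_i)$ and $\beta_i = c_i(z)$.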
Theorem 10.25. There exists a Turing machine $\mathcal{M}_{\mathrm{AR}}$ with the following properties.

1. (Input format) The input to $\mathcal{M}_{\mathrm{AR}}$ consists of two parts: a “decider specification” and a “PCP view.”


(a) (Decider specification) Let $D$ be a decider, $n$, $T$, $Q$, $\sigma$ and $\gamma$ be integers with $Q \le T$ and $\sigma \ge |D|$, and let $x$ and $y$ be strings of length at most $Q$. Let $(q, m, d, m', s) = \mathrm{pcpparams}(n, T, Q, \sigma, \gamma)$ be as in Definition 10.22. Then the corresponding decider specification is the tuple
$$(D, n, T, Q, \sigma, \gamma, x, y).$$




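(b) (PCP view) Let $z \in \mathbb{F}_q^{m'}$ and let $\Xi \in \mathbb{F}_q^{m'+6}$. Then the PCP view is the pair $(z, \Xi)$.

The Turing machine $\mathcal{M}_{\mathrm{AR}}$ returns either 1 (accept) or 0 (reject).

2. (Complexity): $\mathcal{M}_{\mathrm{AR}}$ runs in time at most $\mathrm{poly}(\log(T), \log(n), Q, \sigma, \gamma)$.


For the remaining items, fix a decider specification $(D, n, T, Q, \sigma, \gamma, x, y)$; we think of $\mathcal{M}_{\mathrm{AR}}$ as a function of the PCP view input only.


3. (Completeness): Suppose $a_{\mathrm{prefix}}, b_{\mathrm{prefix}} \in \{0,1\}^*$ are two strings of length $\ell_a, \ell_b \le T$, respectively, such that $D$ halts and accepts in time $T$ on input $(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$. Let $M = 2^m$ and write
$$a = \mathrm{enc}_\Gamma\big(a_{\mathrm{prefix}}, \sqcup^{M/2 - \ell_a}\big), \quad\text{and}\quad b = \mathrm{enc}_\Gamma\big(b_{\mathrm{prefix}}, \sqcup^{M/2 - \ell_b}\big),$$
where recall that by Proposition 10.20, $m$ is chosen so that $2T \le M$, and $\sqcup$ is a special symbol that is encoded using two bits; thus $a$ and $b$ are each an $M$-bit string. Then there exists a low-degree PCP proof (Definition 10.23) $\Pi = (g_1, \ldots, g_5, c_0, \ldots, c_{m'})$ with $g_1 = g_a$ and $g_2 = g_b$, the low-degree encodings of $a$ and $b$, respectively (see Section 3.4), which causes $\mathcal{M}_{\mathrm{AR}}$ to accept with probability 1 over a uniformly random $z \in \mathbb{F}_q^{m'}$:
$$\Pr_{z \in \mathbb{F}_q^{m'}}\big(\mathcal{M}_{\mathrm{AR}}(z, \mathrm{eval}_z(\Pi)) = 1\big) = 1.$$


4. (Soundness): Let $\Pi = (g_1, \ldots, g_5, c_0, \ldots, c_{m'})$ be a low-degree PCP proof such that $\mathcal{M}_{\mathrm{AR}}$ at a uniformly random $z$ accepts with probability larger than $p_{\mathrm{sound}} = \frac{1}{2}$:
$$\Pr_{z \in \mathbb{F}_q^{m'}}\big(\mathcal{M}_{\mathrm{AR}}(z, \mathrm{eval}_z(\Pi)) = 1\big) > p_{\mathrm{sound}}.$$
Then there exist strings $a, b \in \{0,1\}^M$ with the following properties.

(a) There exist strings $a_{\mathrm{prefix}}, b_{\mathrm{prefix}} \in \{0,1\}^*$ of length $\ell_a, \ell_b$, respectively, such that
$$a = \mathrm{enc}_\Gamma\big(a_{\mathrm{prefix}}, \sqcup^{M/2 - \ell_a}\big), \quad\text{and}\quad b = \mathrm{enc}_\Gamma\big(b_{\mathrm{prefix}}, \sqcup^{M/2 - \ell_b}\big);$$

(b) $D$ halts in time $T$ and accepts on input $(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$.


(c) $a = \mathrm{Dec}(g_1)$ and $b = \mathrm{Dec}(g_2)$, where $\mathrm{Dec}(\cdot)$ is the Boolean decoding map of the low-degree encoding defined in Section 3.4.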


It is important to note that the soundness in Theorem 10.25 is only against low-degree proofs, which is why we are able to obtain a soundness error independent of $\gamma$. The eventual answer-reduced verifier (Theorem 10.27) will combine this PCP with the low-degree test, and the latter will introduce a dependence on $\gamma$ in the soundness error.


Proof of Theorem 10.25. We present the description of the Turing machine $\mathcal{M}_{\mathrm{AR}}$ in Figure 13. Before establishing the Complexity, Completeness, and Soundness properties of $\mathcal{M}_{\mathrm{AR}}$, we point out several important items. Recall that $M = 2^m$.




The Turing machine $\mathcal{M}_{\mathrm{AR}}$ takes the following as input:
• Decider specification $(D, n, T, Q, \sigma, \gamma, x, y)$
• PCP view $(z, \Xi)$
The Turing machine performs the following steps sequentially:


1. Compute $C = \mathrm{PaddedSuccinctDecider}(D, n, T, Q, \sigma, x, y)$. The circuit $C$ has five $m$-bit inputs and five single-bit inputs, and contains at most $s$ AND and OR gates.


2. Compute the Tseitin formula $F$ (Definition 10.4) corresponding to $C$, which is a boolean formula on $m' = 5m + 5 + s$ variables. Let $F_{\mathrm{arith}} : \mathbb{F}_q^{m'} \to \mathbb{F}_q$ denote the arithmetization of $F$ (Definition 10.5).


3. Parse $z = (x, o, w) \in \mathbb{F}_q^{m'}$ where $x_1, \ldots, x_5 \in \mathbb{F}_q^m$, $o \in \mathbb{F}_q^5$ and $w \in \mathbb{F}_q^s$. (Here $x$ is a variable which should not be confused with an input $x$ to $D$.) Parse $\Xi = (\alpha_1, \ldots, \alpha_5, \beta_0, \ldots, \beta_{m'}) \in \mathbb{F}_q^{m'+6}$.


4. (Formula test) Reject if $\beta_0 \neq F_{\mathrm{arith}}(x, o, w) \cdot (\alpha_1 - o_1) \cdots (\alpha_5 - o_5)$. Otherwise, continue.


5. (Zero on subcube test) Reject if $\beta_0 \neq \sum_{i=1}^{m'} \beta_i \cdot \mathrm{zero}(z_i)$. Otherwise, accept.


Figure 13: The decision procedure $\mathcal{M}_{\mathrm{AR}}$.


1. We recall what it means for the circuit $C$ computed in Step 1 of Figure 13 to succinctly describe the 5SAT formula $\varphi_C$. Let the inputs of the formula $\varphi_C$ be strings $a, b, u_3, u_4, u_5 \in \{0,1\}^M$. Then for all $x_1, \ldots, x_5 \in \{0,1\}^m$ and $o \in \{0,1\}^5$, we have that $C(x_1, \ldots, x_5, o) = 1$ if and only if $a_{x_1}^{o_1} \vee b_{x_2}^{o_2} \vee u_{3,x_3}^{o_3} \vee u_{4,x_4}^{o_4} \vee u_{5,x_5}^{o_5}$ is a clause in $\varphi_C$, where the coordinates of $a, b, u_3, u_4, u_5$ are indexed by strings of length $m$.

Furthermore, the formula $\varphi_C$ is related to the decider $D$ in the following way. For all $a, b \in \{0,1\}^M$, there exist $u_3, u_4, u_5 \in \{0,1\}^M$ such that $a, b, u_3, u_4, u_5$ satisfy $\varphi_C$ if and only if there exist $a_{\mathrm{prefix}}, b_{\mathrm{prefix}} \in \{0,1\}^*$ of lengths $\ell_a, \ell_b \le T$, respectively, such that
$$a = \mathrm{enc}_\Gamma\big(a_{\mathrm{prefix}}, \sqcup^{M/2 - \ell_a}\big), \quad\text{and}\quad b = \mathrm{enc}_\Gamma\big(b_{\mathrm{prefix}}, \sqcup^{M/2 - \ell_b}\big),$$
and $D$ accepts $(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$ in time $T$.


2. We recall from Definition 10.4 that the boolean formula $F$ is related to $C$ in the following way. For all $x_1, \ldots, x_5 \in \{0,1\}^m$ and $o \in \{0,1\}^5$, $C(x_1, \ldots, x_5, o) = 1$ if and only if there exists a $w \in \{0,1\}^s$ such that $F(x_1, \ldots, x_5, o, w) = 1$. The size of the formula $F$ is linear in the size of the circuit $C$, which is $s$.


3. We recall from Definition 10.5 that the arithmetization $F_{\mathrm{arith}} := \mathrm{arith}_q(F)$ is a function $F_{\mathrm{arith}} : \mathbb{F}_q^{m'} \to \mathbb{F}_q$ such that
$$\forall (x, o, w) \in \{0,1\}^{m'}, \quad F_{\mathrm{arith}}(x, o, w) = F(x, o, w). \qquad (131)$$




By Proposition 10.6, $F_{\mathrm{arith}}$ is an individual degree 2 polynomial.
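The two algebraic checks performed in Steps 4 and 5 of Figure 13 can be sketched as follows. This is a toy rendering under several assumptions: arithmetic is done modulo a prime `p` rather than in the characteristic-2 field $\mathbb{F}_q$ used in the construction, the value of $F_{\mathrm{arith}}$ at $z$ is passed in precomputed instead of being derived from a Tseitin formula, the block $o$ is passed separately rather than parsed out of $z$, and the function names are ours.

```python
from math import prod


def zero(t, p):
    """The univariate polynomial t (1 - t), reduced modulo p."""
    return t * (1 - t) % p


def m_ar_tests(z, o, alphas, betas, f_arith_at_z, p):
    """Return True (accept) iff both tests of Figure 13 pass.

    z            : the point, a tuple of m' field elements
    o            : the five-element block of z compared against the alphas
    alphas       : claimed values g_1(x_1), ..., g_5(x_5)
    betas        : claimed values c_0(z), ..., c_{m'}(z)
    f_arith_at_z : value of the arithmetized formula at z (assumed given)
    """
    # Formula test: beta_0 must equal F_arith(z) * prod_i (alpha_i - o_i).
    formula_rhs = f_arith_at_z * prod(a - oi for a, oi in zip(alphas, o)) % p
    if betas[0] % p != formula_rhs:
        return False
    # Zero-on-subcube test: beta_0 must equal sum_i beta_i * zero(z_i).
    subcube_rhs = sum(b * zero(zi, p) for b, zi in zip(betas[1:], z)) % p
    return betas[0] % p == subcube_rhs
```

On a Boolean point every $\mathrm{zero}(z_i)$ is 0, so the second test forces $\beta_0 = 0$; the first test then passes only if the formula value is 0 or some $\alpha_i$ equals $o_i$, which mirrors the clause-satisfaction argument used in the completeness proof.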


Complexity. We bound the complexity of the Turing machine $\mathcal{M}_{\mathrm{AR}}$. From Proposition 10.20, Step 1 takes time $\mathrm{poly}(\log(T), \log(n), Q, \sigma)$. Computing the Tseitin formula in Step 2 takes time that is polynomial in the size of the circuit $C$, which is $\mathrm{poly}(s) = \mathrm{poly}(\log(T), \log(n), Q, \sigma)$. The Formula Test, Step 4, requires computing the arithmetization $F_{\mathrm{arith}}$ at a point $z \in \mathbb{F}_q^{m'}$, which takes time $\mathrm{poly}(s, \log q)$. The Zero on Subcube Test, Step 5, takes time $\mathrm{poly}(m', \log q)$ to compute a sum of $m'$ products of $\mathbb{F}_q$ elements (evaluating $\mathrm{zero}(\cdot)$ takes time $\mathrm{poly}(\log q)$). Thus the complexity of $\mathcal{M}_{\mathrm{AR}}$ is bounded by
$$\mathrm{poly}(\log(T), \log(n), Q, \sigma, \gamma)$$
using our choice of pcpparams from Definition 10.22. (The dependence on $\gamma$ comes from the setting of $q$.)


Completeness. We now establish the Completeness property of $\mathcal{M}_{\mathrm{AR}}$ by constructing a PCP proof $\Pi$ that is accepted with probability 1.

Let $a, b \in \{0,1\}^M$ be as specified in Item 3 of Theorem 10.25. Then by definition of the formula $\varphi_C$ and the assumption that $D$ halts and accepts in time $T$ on input $(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$, there exist strings $u_3, u_4, u_5 \in \{0,1\}^M$ such that $(a, b, u_3, u_4, u_5)$ satisfies $\varphi_C$.

Let $g_1, g_2$ denote the low-degree encodings of $a$ and $b$, respectively (see Section 3.4 for the definition of low-degree encoding), and let $g_3, g_4, g_5$ denote the low-degree encodings of $u_3$, $u_4$, and $u_5$, respectively. In particular, $g_1, \ldots, g_5$ are $m$-variate multilinear polynomials, and for all $x_1, \ldots, x_5 \in \{0,1\}^m$,
$$g_1(x_1) = a_{x_1}, \quad g_2(x_2) = b_{x_2}, \quad g_3(x_3) = u_{3,x_3}, \quad g_4(x_4) = u_{4,x_4}, \quad g_5(x_5) = u_{5,x_5},$$
where we index the coordinates of $a, b, u_3, u_4, u_5$ by $m$-bit strings in the natural way.
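The two properties of the encodings used here (multilinearity, and agreement with the encoded string on Boolean points) determine a unique polynomial, the multilinear extension. The sketch below evaluates it directly. It is an illustration only: we assume the encoding of Section 3.4 coincides with this standard multilinear extension in the present setting, we work modulo the prime 101 rather than in $\mathbb{F}_q$, and we index the bits of the string by $m$-bit strings in big-endian order; all three are assumptions of the example.

```python
from itertools import product

P = 101  # assumption: a small prime field stands in for F_q


def multilinear_extension(bits, m):
    """Return g with g(u) = bits[u] on {0,1}^m and g multilinear over F_P.

    g(x) = sum over u in {0,1}^m of bits[u] * prod_j (x_j u_j + (1 - x_j)(1 - u_j)).
    """
    assert len(bits) == 2 ** m

    def g(x):
        total = 0
        for index, u in enumerate(product((0, 1), repeat=m)):
            weight = 1
            for xj, uj in zip(x, u):
                weight = weight * (xj * uj + (1 - xj) * (1 - uj)) % P
            total = (total + bits[index] * weight) % P
        return total

    return g
```

Evaluating `g` at a Boolean point returns the corresponding bit, and along any axis-parallel line the values of `g` form an affine function of the moving coordinate, which is what individual degree 1 means.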
Next, define the polynomial $c_0 : \mathbb{F}_q^{m'} \to \mathbb{F}_q$ as
$$c_0(x, o, w) = F_{\mathrm{arith}}(x, o, w) \cdot (g_1(x_1) - o_1) \cdots (g_5(x_5) - o_5).$$
Note that $c_0$ has individual degree 3, since, as recalled in item 3 above, $F_{\mathrm{arith}}(x, o, w)$ has individual degree 2 and each of the $g_i$'s is multilinear.

We now show that $c_0(x, o, w) = 0$ for all $(x, o, w) \in \{0,1\}^{m'}$. By construction, the arithmetization $F_{\mathrm{arith}}(x, o, w)$ is either 0 or 1. If it is 0, then we are done. If it is 1, then this means that $F(x, o, w) = 1$, which by definition means that $C(x, o) = 1$, which means that $a_{x_1}^{o_1} \vee b_{x_2}^{o_2} \vee u_{3,x_3}^{o_3} \vee u_{4,x_4}^{o_4} \vee u_{5,x_5}^{o_5}$ is a clause in the formula $\varphi_C$. But since $(a, b, u_3, u_4, u_5)$ satisfies $\varphi_C$, this means that at least one of $a_{x_1} = o_1$, $b_{x_2} = o_2$, $u_{3,x_3} = o_3$, $u_{4,x_4} = o_4$, or $u_{5,x_5} = o_5$ holds. But this means that the product $(g_1(x_1) - o_1) \cdots (g_5(x_5) - o_5) = 0$, by definition of the $g_i$'s. Thus $c_0(x, o, w) = 0$.

Thus, by Proposition 10.21, there exist polynomials $c_1, \ldots, c_{m'} : \mathbb{F}_q^{m'} \to \mathbb{F}_q$ of individual degree at most 3 that certify that $c_0$ vanishes on $\{0,1\}^{m'}$. In other words, for all $z \in \mathbb{F}_q^{m'}$ (not just over the boolean cube), we have
$$c_0(z) = \sum_{i=1}^{m'} c_i(z) \cdot \mathrm{zero}(z_i), \qquad (132)$$
where $\mathrm{zero}(x) = x(1 - x)$.


Let the PCP proof $\Pi$ be the collection of polynomials $(g_1, \ldots, g_5, c_0, c_1, \ldots, c_{m'})$, where the $g_i$'s act on disjoint variables and the $c_i$'s act on all of them. Let $z \in \mathbb{F}_q^{m'}$ and let $\Xi = (\alpha_1, \ldots, \alpha_5, \beta_0, \beta_1, \ldots, \beta_{m'}) = \mathrm{eval}_z(\Pi)$. The PCP view $(z, \Xi)$ always passes the Formula Test, because $\beta_0 = c_0(z)$ and $\alpha_i = g_i(x_i)$ for $i \in \{1, 2, \ldots, 5\}$. The PCP view also passes the Zero on Subcube Test, because of Equation (132).

Thus the PCP view is accepted by the Turing machine $\mathcal{M}_{\mathrm{AR}}$ with probability 1. This establishes the Completeness property.


Soundness. Fix a low-degree PCP proof $\Pi = (g_1, \ldots, g_5, c_0, \ldots, c_{m'})$ such that the PCP view $(z, \Xi)$ for $\Xi = \mathrm{eval}_z(\Pi)$ causes $\mathcal{M}_{\mathrm{AR}}$ to accept with probability greater than $p_{\mathrm{sound}}$. This, in particular, implies that the Formula Test is satisfied with high probability. With probability greater than $p_{\mathrm{sound}}$ over the choice of $z \sim \mathbb{F}_q^{m'}$, we have
$$c_0(z) = F_{\mathrm{arith}}(z) \cdot (g_1(x_1) - o_1) \cdots (g_5(x_5) - o_5).$$
Note that $c_0(z)$, by definition of low-degree PCP proof, has individual degree at most $d$. The right-hand side has individual degree at most $2 + 5d$, because $F_{\mathrm{arith}}$ has individual degree 2 and the $g_i$'s have individual degree at most $d$. Thus both sides have total degree at most $(2 + 5d)m'$. Suppose $c_0(z)$ was not identical to $F_{\mathrm{arith}}(z) \cdot (g_1(x_1) - o_1) \cdots (g_5(x_5) - o_5)$. Then the Schwartz-Zippel lemma implies that the probability they agree on a randomly chosen $z$ is at most $(2 + 5d)m'/q$. By our choice of $q$, $m'$, and $d$ in Definition 10.22, this is less than $p_{\mathrm{sound}}$, which is a contradiction. Thus $c_0(z)$ is equal to $F_{\mathrm{arith}}(z) \cdot (g_1(x_1) - o_1) \cdots (g_5(x_5) - o_5)$ for all $z \in \mathbb{F}_q^{m'}$.

Next, we consider the Zero on Subcube Test. With probability greater than $p_{\mathrm{sound}}$ over the choice of $z \sim \mathbb{F}_q^{m'}$, we have
$$c_0(z) = \sum_{i=1}^{m'} c_i(z) \cdot \mathrm{zero}(z_i).$$
Again, the polynomials on both sides of the equation have individual degree at most $d + 2$, and therefore total degree at most $(2 + d)m'$. By the Schwartz-Zippel lemma, the probability that both sides would agree on the evaluation of a random $z$, if they weren't equal polynomials, would be at most $(2 + d)m'/q$, which is also less than $p_{\mathrm{sound}}$. Thus $c_0(z) = \sum_{i=1}^{m'} c_i(z) \cdot \mathrm{zero}(z_i)$ for all $z \in \mathbb{F}_q^{m'}$. This in particular implies that $c_0(z)$ vanishes on the subcube $\{0,1\}^{m'}$.
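The Schwartz-Zippel bound invoked twice above can be observed directly on a toy instance. The sketch below is illustrative only: the field (integers modulo 101) and the two bivariate polynomials are our own choices, not objects from the construction. It counts, by exhaustive enumeration, the points at which two distinct polynomials of total degree 3 agree.

```python
P = 101  # assumption: a small prime field chosen for exhaustive enumeration


def f(x, y):
    return (x * x * y + 3) % P


def g(x, y):
    return (x * y * y + 3) % P


# f - g = x*y*(x - y) is a nonzero polynomial of total degree 3.
agreements = sum(1 for x in range(P) for y in range(P) if f(x, y) == g(x, y))
total_degree = 3
# Schwartz-Zippel: at most a (degree / |F|) fraction of the |F|^2 points agree.
bound = total_degree * P
```

Here the agreement set is the union of the three lines $x = 0$, $y = 0$ and $x = y$, which contains 301 points, just under the Schwartz-Zippel bound of 303 out of 10201. In the proof, the same bound with degree $(2 + 5d)m'$ and field size $q$ is what forces the two sides to be identical once they agree with probability above $p_{\mathrm{sound}}$.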

We can now decode assignments $a, b, u_3, u_4, u_5 \in \{0,1\}^M$ that satisfy the formula $\varphi_C$. Let $a = \mathrm{Dec}(g_1)$, $b = \mathrm{Dec}(g_2)$, $u_3 = \mathrm{Dec}(g_3)$, $u_4 = \mathrm{Dec}(g_4)$, $u_5 = \mathrm{Dec}(g_5)$, where $\mathrm{Dec}(\cdot)$ is the decoding map defined in Section 3.4. The fact that $c_0(z)$ vanishes on the subcube $\{0,1\}^{m'}$ implies that all clauses of $\varphi_C$ are satisfied by $a, b, u_3, u_4, u_5$. By construction of $\varphi_C$, this implies that there exist $a_{\mathrm{prefix}}, b_{\mathrm{prefix}}$ such that $D(n, x, y, a_{\mathrm{prefix}}, b_{\mathrm{prefix}})$ accepts in time $T$. This completes the proof of the Soundness property, and the proof of the Theorem. □


10.5 A normal form verifier for the PCP 


In this section we show how to convert the PCP from Section 10.4 into a normal form verifier. This results in an “answer reduction” scheme: a way to map a verifier $\mathcal{V}$ into a new (typed) verifier $\hat{\mathcal{V}}^{\mathrm{AR}}$ with a smaller answer size.

Let $\mathcal{V} = (\mathcal{S}, \mathcal{D})$ be a normal form verifier and $(\lambda, \mu, \gamma)$ a tuple of integers. In the rest of this section we define the (typed) answer-reduced verifier $\hat{\mathcal{V}}^{\mathrm{AR}} = (\hat{\mathcal{S}}^{\mathrm{AR}}, \hat{\mathcal{D}}^{\mathrm{AR}})$ associated with $\mathcal{V}$ and parameters $(\lambda, \mu, \gamma)$. Completeness, complexity and soundness of the construction are shown in the following sections.




10.5.1 Parameters and notation 


We establish some notation and parameters that are used throughout Section 10.5. First, define the functions
$$T(n) = (2^{\lambda n})^{\mu} \quad\text{and}\quad Q(n) = (\lambda n)^{\mu}. \qquad (133)$$
In the main theorem of this section, Theorem 10.27, we assume that the input verifier $\mathcal{V}$ satisfies
$$\mathrm{TIME}_{\mathcal{D}}(n) \le T(n) \quad\text{and}\quad \mathrm{TIME}_{\mathcal{S}}(n) \le Q(n). \qquad (134)$$
In what follows we write $T$ and $Q$ as free parameters (though they are implicitly functions of the index $n$).
Next, for all integers $n \in \mathbb{N}$ define the PCP parameters $(q, m, d, m', s) = \mathrm{pcpparams}(n, T, Q, \sigma, \gamma)$ where $\sigma = |\mathcal{D}|$. Note that the parameters $(q, m, d, m', s)$ are all implicitly functions of $n$.
Recall from Definition 4.16 the notation $\mu_{\mathcal{S}}$ denoting the distribution over pairs of questions $(x_A, x_B)$ generated by $\mathcal{S}$. We use $\mu_{\mathcal{S},A}$ to indicate the marginal distribution of $\mu_{\mathcal{S}}$ on the first question $x_A$, and $\mathrm{SUPP}(\mu_{\mathcal{S}})$ to indicate the set of question pairs that have nonzero probability under $\mu_{\mathcal{S}}$.


Remark 10.26. Throughout Section 10.5, for convenience we often identify the label A with 1 and B with 2. For example, $g_A$ is another label for the polynomial $g_1$.


10.5.2 The answer-reduced verifier 


Let $\mathcal{V} = (\mathcal{S}, \mathcal{D})$ be a normal form verifier and $(\lambda, \mu, \gamma)$ integers. All other required parameters and notation are introduced in Section 10.5.1.


Sampler. In this section we define the (typed) sampler $\hat{\mathcal{S}}^{\mathrm{AR}}$ for the (typed) answer reduced verifier $\hat{\mathcal{V}}^{\mathrm{AR}}$.

We first give an intuitive description of the sampler, and then proceed to give a formal definition. The sampler $\hat{\mathcal{S}}^{\mathrm{AR}}$ is a product of the oracularized sampler $\hat{\mathcal{S}}^{\mathrm{ORAC}}$ corresponding to $\mathcal{S}$, and a typed PCP sampler $\hat{\mathcal{S}}^{\mathrm{PCP}}$. The corresponding question distribution $\mu_{\hat{\mathcal{S}}^{\mathrm{AR}}}$ is then a product distribution $\mu_{\hat{\mathcal{S}}^{\mathrm{ORAC}}} \times \mu_{\hat{\mathcal{S}}^{\mathrm{PCP}}}$.

The PCP distribution $\mu_{\hat{\mathcal{S}}^{\mathrm{PCP}}}$ corresponds to six copies of the classical low-degree test question distribution (see Section 7.1.1). Recall that the classical low-degree test is a nonlocal game that checks whether the players' answers are consistent with low-degree polynomials. The answer-reduced verifier will use the classical low-degree test to check that the players respond with evaluations of low-degree polynomials $g_1, \ldots, g_5, c_0, \ldots, c_{m'}$, and then process the evaluations using the PCP verifier $\mathcal{M}_{\mathrm{AR}}$ from Theorem 10.25. Five of the six copies are meant to certify the “low-degreeness” of $g_1, \ldots, g_5$, and the sixth copy is used to certify the “low-degreeness” of the entire bundle of polynomials $g_1, \ldots, g_5, c_0, \ldots, c_{m'}$. The reason that the $g_i$'s are checked individually is to ensure that they only depend on certain blocks of input variables, because ultimately the soundness of the answer-reduced verifier reduces to the soundness of the PCP verifier $\mathcal{M}_{\mathrm{AR}}$, and Theorem 10.25 assumes that the polynomials $g_1, \ldots, g_5$ of the PCP proof only depend on certain subsets of variables.

We now formally define the typed PCP sampler $\hat{\mathcal{S}}^{\mathrm{PCP}}$. The type set is
$$\mathcal{T}^{\mathrm{PCP}} = \{\mathrm{POINT}_1, \ldots, \mathrm{POINT}_6\} \cup \{\mathrm{ALINE}_1, \ldots, \mathrm{ALINE}_6\} \cup \{\mathrm{DLINE}_1, \ldots, \mathrm{DLINE}_6\},$$
and the type graph $G^{\mathrm{PCP}} = (\mathcal{T}^{\mathrm{PCP}}, E^{\mathrm{PCP}})$ uses the complete edge set $E^{\mathrm{PCP}} = \mathcal{T}^{\mathrm{PCP}} \times \mathcal{T}^{\mathrm{PCP}}$. The ambient vector space for the sampler is
$$V^{\mathrm{PCP}} = \Big(\bigoplus_{i=1}^{5} V_{i,X} \oplus V_{i,I} \oplus V_{i,V}\Big) \oplus V_{\mathrm{aux},X} \oplus V_{\mathrm{aux},I} \oplus V_{\mathrm{aux},V}, \qquad (135)$$
i=1 




where the spaces V;x,Vj;y are each isomorphic to F7- the space V;r is isomorphic to F}, the spaces 
Vaux,x, Vaux,v are each isomorphic to Ep, and the space Vaux, is isomorphic to F}. In addition, de- 


fine the following direct sums: 


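$$V_{6,X} = \Big(\bigoplus_{i=1}^{5} V_{i,X}\Big) \oplus V_{\mathrm{aux},X},$$
$$V_{6,I} = \Big(\bigoplus_{i=1}^{5} V_{i,I}\Big) \oplus V_{\mathrm{aux},I},$$
$$V_{6,V} = \Big(\bigoplus_{i=1}^{5} V_{i,V}\Big) \oplus V_{\mathrm{aux},V}.$$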


For all $i \in \{1, 2, \ldots, 6\}$, we call the space $V_{i,X}$ the $i$-th point register, the space $V_{i,I}$ the $i$-th coordinate register, and the space $V_{i,V}$ the $i$-th direction register (compare this with the subspaces defined in Section 7.1.1). For every type $t \in \mathcal{T}^{\mathrm{PCP}}$, we define the following CL functions on $V^{\mathrm{PCP}}$:


• For the types $t = \mathrm{POINT}_i$ for $i \in \{1, \ldots, 6\}$, the corresponding 1-level CL function $L_{\mathrm{POINT}_i}$ is identical to the CL function $L_{\mathrm{POINT}}$ defined in Equation (48), but acts on the subspace $V_{i,X} \oplus V_{i,I} \oplus V_{i,V}$ (and zeroes out all the other registers).


• For the types $t = \mathrm{ALINE}_i$ for $i \in \{1, \ldots, 6\}$, the corresponding 2-level CL function $L_{\mathrm{ALINE}_i}$ is identical to the CL function $L_{\mathrm{ALINE}}$ defined in Equation (49), but acts on the subspaces $V_{i,X} \oplus V_{i,I} \oplus V_{i,V}$.


• For the types $t = \mathrm{DLINE}_i$ for $i \in \{1, \ldots, 6\}$, the corresponding 3-level CL function $L_{\mathrm{DLINE}_i}$ is identical to the CL function $L_{\mathrm{DLINE}}$ defined in Equation (51), but acts on the subspaces $V_{i,X} \oplus V_{i,I} \oplus V_{i,V}$.


Observe that for $i \in \{1, \ldots, 5\}$, the CL distributions $\mu_{L_{\mathrm{POINT}_i}}$, $\mu_{L_{\mathrm{ALINE}_i}}$, and $\mu_{L_{\mathrm{DLINE}_i}}$ correspond to the question distributions of the classical low-degree test parameterized by $q$ and $m$ (see Section 7.1.1 for the definition of the low-degree test distributions, which only depend on the first two arguments of the 4-tuple ldparams), and the CL distributions $\mu_{L_{\mathrm{POINT}_6}}$, $\mu_{L_{\mathrm{ALINE}_6}}$, and $\mu_{L_{\mathrm{DLINE}_6}}$ correspond to the classical low-degree test parameterized by $q$ and $m'$.
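We finally give the formal definition of the sampler $\hat{\mathcal{S}}^{\mathrm{AR}}$. The type set of $\hat{\mathcal{S}}^{\mathrm{AR}}$ is $\mathcal{T}^{\mathrm{AR}} = \mathcal{T}^{\mathrm{ORAC}} \times \mathcal{T}^{\mathrm{PCP}}$ and the type graph is $G^{\mathrm{AR}} = G^{\mathrm{ORAC}} \times G^{\mathrm{PCP}}$ with edge set
$$E^{\mathrm{AR}} = \big\{ \{(u, v), (u', v')\} : \{u, u'\} \in E^{\mathrm{ORAC}} \text{ and } \{v, v'\} \in E^{\mathrm{PCP}} \big\}.$$
Let $V^{\mathrm{ORAC}}$ denote the ambient space of the oracularized sampler $\hat{\mathcal{S}}^{\mathrm{ORAC}}$. The ambient space of $\hat{\mathcal{S}}^{\mathrm{AR}}$ is then the direct sum $V^{\mathrm{ORAC}} \oplus V^{\mathrm{PCP}}$. For all type pairs $(t_{\mathrm{ORAC}}, t_{\mathrm{PCP}}) \in \mathcal{T}^{\mathrm{AR}}$, the corresponding CL function $L_{t_{\mathrm{ORAC}}, t_{\mathrm{PCP}}}$ is simply the direct sum of the CL functions $L_{t_{\mathrm{ORAC}}}$ (coming from $\hat{\mathcal{S}}^{\mathrm{ORAC}}$) and $L_{t_{\mathrm{PCP}}}$ (coming from $\hat{\mathcal{S}}^{\mathrm{PCP}}$). Thus one can see that the distribution corresponding to $\hat{\mathcal{S}}^{\mathrm{AR}}$ is the product distribution $\mu_{\hat{\mathcal{S}}^{\mathrm{ORAC}}} \times \mu_{\hat{\mathcal{S}}^{\mathrm{PCP}}}$.

Note that $\hat{\mathcal{S}}^{\mathrm{AR}}$ is a $\max\{\ell, 3\}$-level sampler, where $\ell$ is the number of levels of the sampler $\mathcal{S}$.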
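The product structure of the type graph can be spelled out in a few lines. The sketch below is illustrative and rests on assumptions: the oracularized type set and its edges come from Theorem 9.1, which is not reproduced here, so the set {A, B, ORACLE} and the particular edges listed for it are hypothetical stand-ins; only the PCP type set and the product rule for edges follow the definitions in this section.

```python
from itertools import product

# The 18 PCP types and the complete edge set (self-loops included).
pcp_types = [f"{kind}_{i}" for kind in ("POINT", "ALINE", "DLINE") for i in range(1, 7)]
pcp_edges = {frozenset({u, v}) for u in pcp_types for v in pcp_types}

# Hypothetical stand-in for the oracularized type graph (see Theorem 9.1
# for the real one): self-loops plus edges between ORACLE and each player.
orac_types = ["A", "B", "ORACLE"]
orac_edges = {
    frozenset({"A"}), frozenset({"B"}), frozenset({"ORACLE"}),
    frozenset({"A", "ORACLE"}), frozenset({"B", "ORACLE"}),
}


def product_edges(types1, edges1, types2, edges2):
    """Edges {(u, v), (u', v')} with {u, u'} in edges1 and {v, v'} in edges2."""
    nodes = list(product(types1, types2))
    edges = set()
    for (u, v), (u2, v2) in product(nodes, repeat=2):
        if frozenset({u, u2}) in edges1 and frozenset({v, v2}) in edges2:
            edges.add(frozenset({(u, v), (u2, v2)}))
    return edges


ar_types = list(product(orac_types, pcp_types))
ar_edges = product_edges(orac_types, orac_edges, pcp_types, pcp_edges)
```

Because the PCP edge set is complete, whether two product types are adjacent is decided entirely by the first components; for example, (ORACLE, POINT_6) and (A, POINT_1) are adjacent under the assumed oracularized edges, which is the pairing used by the input consistency check of the decider.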


Decider. The decider $\hat{\mathcal{D}}^{\mathrm{AR}}$ is described in Fig. 14.




Type Question Format Answer Format 


POINT; fori=1,...,5 ME Fy a; € Fy 

ALINE; fori = 1,...,5 v; € Fj’ x Fy h; : Fa > Fg 

DLINĘ; for i = 1,...,5 v; € Fj’ x F4 x F7 Mg e Eg = Pa 

POINT6 z= (y,0, WIE (heorien) E ey. +6 
ALINE, vE Eg x F, l (higos sollo jiye s eadun) ig => P 
DLINE6 oE Fy x F, x Fy (ine scollo hossain) 8 Eg = PE e 


Table 1: Question and answer formats for types in $\mathcal{T}^{\mathrm{PCP}}$.


On input $(n, t_A, x_A, t_B, x_B, a_A, a_B)$, the decider $\hat{\mathcal{D}}^{\mathrm{AR}}$ parses $t_A, t_B$ as $(t_{Q,A}, t_{\Pi,A})$ and $(t_{Q,B}, t_{\Pi,B})$ respectively in $\mathcal{T}^{\mathrm{ORAC}} \times \mathcal{T}^{\mathrm{PCP}}$, and parses $x_A$ and $x_B$ as $(x_{Q,A}, x_{\Pi,A})$ and $(x_{Q,B}, x_{\Pi,B})$ respectively. The answer format depends only on $t_{\Pi,A}$ and $t_{\Pi,B}$ respectively and is as indicated in Table 1. The decider performs the following steps sequentially, for all $w \in \{A, B\}$:


1. (Global consistency check): If $t_A = t_B$, reject if $a_A \neq a_B$.


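2. (Input consistency check): If $t_{Q,w} = \mathrm{ORACLE}$ and $t_{Q,\bar{w}} = v \in \{A, B\}$, and if $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_6, \mathrm{POINT}_v)$, reject if $a_{\bar{w}} \neq \alpha_v$ (where $A \leftrightarrow 1$ and $B \leftrightarrow 2$, as per Remark 10.26).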


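3. (Input low degree test) If $t_{Q,w} = t_{Q,\bar{w}} = v \in \{A, B\}$, and if $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_v, \mathrm{ALINE}_v)$ (resp. $\mathrm{DLINE}_v$ instead of $\mathrm{ALINE}_v$), execute $\mathcal{D}^{\mathrm{LD}}_{\mathrm{ldparams}}$ on input $(\mathrm{POINT}, x_{\Pi,w}, \mathrm{ALINE}, x_{\Pi,\bar{w}}, a_w, a_{\bar{w}})$ (resp. DLINE instead of ALINE), where $\mathrm{ldparams} = (q, m, d, 1)$. Reject if $\mathcal{D}^{\mathrm{LD}}_{\mathrm{ldparams}}$ rejects.


4. (Proof encoding checks): If $t_{Q,w} = t_{Q,\bar{w}} = \mathrm{ORACLE}$,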


(a) (Consistency test) If $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_i, \mathrm{POINT}_6)$ for some $i \in \{3, \ldots, 5\}$, reject if $a_w \neq \alpha_i$.

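(b) (Individual low degree test) If $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_i, \mathrm{ALINE}_i)$ (resp. $\mathrm{DLINE}_i$) for some $i \in \{3, \ldots, 5\}$, execute $\mathcal{D}^{\mathrm{LD}}_{\mathrm{ldparams}}$ on input $(\mathrm{POINT}, x_{\Pi,w}, \mathrm{ALINE}, x_{\Pi,\bar{w}}, a_w, a_{\bar{w}})$ (resp. DLINE). Reject if $\mathcal{D}^{\mathrm{LD}}_{\mathrm{ldparams}}$ rejects.

(c) (Simultaneous low degree test) If $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_6, \mathrm{ALINE}_6)$ (resp. $\mathrm{DLINE}_6$), execute $\mathcal{D}^{\mathrm{LD}}_{\mathrm{ldparams}'}$ on input $(\mathrm{POINT}, x_{\Pi,w}, \mathrm{ALINE}, x_{\Pi,\bar{w}}, a_w, a_{\bar{w}})$ (resp. DLINE), where $\mathrm{ldparams}' = (q, m', d, m' + 6)$. Reject if $\mathcal{D}^{\mathrm{LD}}_{\mathrm{ldparams}'}$ rejects.


5. (Game check): If $t_{Q,w} = \mathrm{ORACLE}$, then for $v \in \{A, B\}$, compute $x_{w,v} = L^v(x_{Q,w})$. If $t_{\Pi,w} = \mathrm{POINT}_6$, reject if $\mathcal{M}_{\mathrm{AR}}((\mathcal{D}, n, T, Q, \gamma, x_{w,A}, x_{w,B}), (z, a_w))$ rejects. Otherwise, accept.


Figure 14: The decision procedure $\hat{\mathcal{D}}^{\mathrm{AR}}$. Parameters $T, Q, q, m, m', d, \gamma$ are defined in Section 10.5.1.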


10.5.3 Main theorem for answer reduction 


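Theorem 10.27. There exists a polynomial time Turing machine ComputeAnsRedVerifier that, on input $(\mathcal{V}, \lambda, \mu, \gamma)$ where $\mathcal{V}$ is an $\ell$-level normal form verifier and $\lambda, \mu, \gamma \in \mathbb{N}$, outputs the description of the answer reduced verifier $\mathcal{V}^{\mathrm{AR}} = (\mathcal{S}^{\mathrm{AR}}, \mathcal{D}^{\mathrm{AR}})$. Let $T(n)$ and $Q(n)$ be the functions specified in Eq. (133), and suppose that $\mathcal{V}$ satisfies the complexity conditions specified in Eq. (134). Then $\mathcal{V}^{\mathrm{AR}}$ is $\max\{\ell + 2, 5\}$-level, has time complexity bounds
$$\mathrm{TIME}_{\mathcal{S}^{\mathrm{AR}}}(n) = O\big(\mathrm{TIME}_{\hat{\mathcal{S}}^{\mathrm{ORAC}}}(n) + \mathrm{poly}(\log(T(n)), |\mathcal{D}|, \gamma)\big) = \mathrm{poly}((\lambda n)^{\mu}, |\mathcal{D}|, \gamma),$$
$$\mathrm{TIME}_{\mathcal{D}^{\mathrm{AR}}}(n) = \mathrm{poly}(\log(T(n)), Q(n), |\mathcal{D}|, \gamma) = \mathrm{poly}((\lambda n)^{\mu}, |\mathcal{D}|, \gamma),$$
and furthermore the sampler $\mathcal{S}^{\mathrm{AR}}$ only depends on $\mathcal{S}$, the parameter tuple $(\lambda, \mu, \gamma)$, and the description length $|\mathcal{D}|$ (but nothing else about $\mathcal{D}$). Moreover, there exists a function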


5(e,n) = ay"((A-|[D]-n)™-e' + (A-|D|-n) 7) 
for universal constants a > 0,0 < b < 1, such that the following hold for all n > 2: 


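1. (Completeness) If $\mathcal{V}_n$ has a projective, consistent, and commuting (PCC) strategy of value 1, then $\mathcal{V}^{\mathrm{AR}}_n$ has an SPCC strategy with value 1.


2. (Soundness) If $\mathrm{val}^*(\mathcal{V}^{\mathrm{AR}}_n) > 1 - \varepsilon$ for some $\varepsilon > 0$ then $\mathrm{val}^*(\mathcal{V}_n) \ge 1 - \delta(\varepsilon, n)$.

3. (Entanglement) Let $\mathcal{E}(\cdot)$ be as defined in Definition 5.12. Then for all $\varepsilon > 0$,
$$\mathcal{E}(\mathcal{V}^{\mathrm{AR}}_n, 1 - \varepsilon) \ge \mathcal{E}(\mathcal{V}_n, 1 - \delta(\varepsilon, n)).$$

Observe that $\mathrm{TIME}_{\mathcal{D}^{\mathrm{AR}}}(n)$ is polynomial in the logarithm of $T(n)$, the runtime of the decider of $\mathcal{V}$, achieving the desired reduction in runtime and answer complexity.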
Proof. The Turing machine ComputeAnsRedVerifier can be described as follows:


1. From the description of $\mathcal{V}$, compute the description of the (typed) oracularized verifier $\hat{\mathcal{V}}^{\mathrm{ORAC}} = (\hat{\mathcal{S}}^{\mathrm{ORAC}}, \hat{\mathcal{D}}^{\mathrm{ORAC}})$ using the Turing machine ComputeOracleVerifier from Theorem 9.1.


2. From the description of the typed sampler $\hat{\mathcal{S}}^{\mathrm{ORAC}}$ and the parameter tuple $(\lambda, \mu, \gamma)$, compute the description of the typed sampler $\hat{\mathcal{S}}^{\mathrm{AR}}$ as described in Section 10.5.2.


3. From the description of the decider $\mathcal{D}$ and the parameter tuple $(\lambda, \mu, \gamma)$, compute the description of the typed decider $\hat{\mathcal{D}}^{\mathrm{AR}}$ as described in Figure 14.


4. Compute the detyped verifier $\mathcal{V}^{\mathrm{AR}} = (\mathcal{S}^{\mathrm{AR}}, \mathcal{D}^{\mathrm{AR}}) = \mathrm{Detype}(\hat{\mathcal{V}}^{\mathrm{AR}})$.


The Turing machine ComputeAnsRedVerifier takes time that is $\mathrm{poly}(|\mathcal{V}|, \log \lambda, \log \mu, \log \gamma)$, because each step, including the detyping procedure, runs in time that is polynomial in the length of the input $(\mathcal{V}, \lambda, \mu, \gamma)$.


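Properties of the sampler $\mathcal{S}^{\mathrm{AR}}$. It can be verified via inspection that the typed sampler $\hat{\mathcal{S}}^{\mathrm{AR}}$ depends on $\hat{\mathcal{S}}^{\mathrm{ORAC}}$ (which itself only depends on $\mathcal{S}$; see Theorem 9.1), and $\hat{\mathcal{S}}^{\mathrm{PCP}}$ (which only depends on the parameter tuple $(\lambda, \mu, \gamma)$ and the description length $|\mathcal{D}|$). Using Theorem 9.1, $\hat{\mathcal{S}}^{\mathrm{ORAC}}$ is an $\ell$-level typed sampler and, therefore, $\hat{\mathcal{S}}^{\mathrm{AR}}$ is a $\max\{\ell, 3\}$-level sampler as $\hat{\mathcal{S}}^{\mathrm{PCP}}$ is a typed 3-level sampler. Using Lemma 6.18 for the detyping it follows that $\mathcal{S}^{\mathrm{AR}}$ is a $\max\{\ell + 2, 5\}$-level typed sampler.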




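Complexity. In addition to the running time of $\hat{\mathcal{S}}^{\mathrm{ORAC}}$, the time complexity of $\hat{\mathcal{S}}^{\mathrm{AR}}$ also includes the running time of $\hat{\mathcal{S}}^{\mathrm{PCP}}$, which is dominated by the complexity of computing the CL functions $L_{\mathrm{POINT}_i}$, $L_{\mathrm{ALINE}_i}$, and $L_{\mathrm{DLINE}_i}$. From Lemma 7.9, this takes time
$$\mathrm{poly}(m', \log q) = \mathrm{poly}(\gamma, s) = \mathrm{poly}(\log T, \log n, Q, |\mathcal{D}|, \gamma) = \mathrm{poly}(\log T, |\mathcal{D}|, \gamma),$$
where the last equality uses the formulas (133). Therefore the time complexity of $\hat{\mathcal{S}}^{\mathrm{AR}}$ is
$$O\big(\mathrm{TIME}_{\hat{\mathcal{S}}^{\mathrm{ORAC}}}(n) + \mathrm{poly}(\log T(n), |\mathcal{D}|, \gamma)\big).$$
The complexity of $\mathcal{S}^{\mathrm{AR}} = \mathrm{Detype}(\hat{\mathcal{S}}^{\mathrm{AR}})$ then follows by Lemma 6.18 and Equation (134).

The decider $\hat{\mathcal{D}}^{\mathrm{AR}}$ executes subroutines $\mathcal{D}^{\mathrm{LD}}$ and $\mathcal{M}_{\mathrm{AR}}$. The runtime of $\mathcal{D}^{\mathrm{LD}}$ is $\mathrm{poly}(m, d, m', \log q)$. The runtime of $\mathcal{M}_{\mathrm{AR}}$ is given in Theorem 10.25; for $T$ and $Q$ as in (133) it is $\mathrm{poly}(\log T, \log n, \log q, Q, |\mathcal{D}|)$.

In addition to these subroutines, $\hat{\mathcal{D}}^{\mathrm{AR}}$ computes the CL functions $L^A$ and $L^B$ of the oracularized sampler $\hat{\mathcal{S}}^{\mathrm{ORAC}}$ in Step 5 of Figure 14. By Theorem 9.1, the complexity of $\hat{\mathcal{S}}^{\mathrm{ORAC}}$ is polynomial in the time complexity of $\mathcal{S}$, which by Equation (134) is at most $Q(n)$.

Thus the overall time complexity of $\hat{\mathcal{D}}^{\mathrm{AR}}$, using both the PCP parameter settings of Definition 10.22 and the assumptions of Equation (134), is $\mathrm{poly}((\lambda n)^{\mu}, |\mathcal{D}|, \gamma)$. The complexity of $\mathrm{Detype}(\hat{\mathcal{D}}^{\mathrm{AR}})$ follows by Lemma 6.18.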


Completeness, Soundness, and Entanglement. The Completeness property is established in Section 10.6 and the Soundness and Entanglement properties are proven in Section 10.7. □


10.6 Completeness of the answer-reduced verifier 


We establish the Completeness property of the answer reduced verifier as stated in Theorem 10.27. 


Proof of the Completeness part of Theorem 10.27. We establish the Completeness property of $\hat{\mathcal{V}}^{\mathrm{AR}}$, which by the Detyping Lemma (Lemma 6.18) establishes the Completeness property of $\mathcal{V}^{\mathrm{AR}}$.

Let $n \ge 1$ be an index for $\hat{\mathcal{V}}^{\mathrm{AR}}$. Let $\mathscr{S}$ be a PCC strategy for $\mathcal{V}_n$ with value 1. By Theorem 9.1 it follows that there exists a symmetric PCC strategy $\mathscr{S}^{\mathrm{ORAC}}$ using a state $|\psi\rangle$ with value 1 for $\hat{\mathcal{V}}^{\mathrm{ORAC}}_n$. We define a strategy $\mathscr{S}^{\mathrm{AR}}$ for the typed verifier $\hat{\mathcal{V}}^{\mathrm{AR}}_n$ as follows. The shared state is $|\psi\rangle$. Given the index $n$ and $(\lambda, \mu, \gamma)$, each player can compute $(q, m, d, m', s) = \mathrm{pcpparams}(n, T, Q, \sigma, \gamma)$ (see Definition 10.22), where $T = T(n)$ and $Q = Q(n)$ are as in (133), and $\sigma = |\mathcal{D}|$. Let $M = 2^m$.


1. On receipt of a question $((t_Q, t_\Pi), (x_Q, x_\Pi))$ a player first measures their share of $|\psi\rangle$ using the
projective measurement for $\mathscr{S}^{\mathrm{ORAC}}$ for the typed question $(t_Q, x_Q)$ to obtain an outcome $a_Q$. The
player then computes an answer, depending on $t_Q$, $a_Q$ and $t_\Pi$, $x_\Pi$, as follows:


(a) Suppose $t_Q = v \in \{A,B\}$ and $t_\Pi \in \{\mathrm{POINT}_v, \mathrm{ALINE}_v\}$ (resp. $\mathrm{DLINE}_v$ instead of $\mathrm{ALINE}_v$).
Let $a' = a_Q$ if $a_Q$ has length at most T, and let $a'$ be the truncation of $a_Q$ to its first T symbols
otherwise. Let $\ell_a \le T$ be the length of $a'$, and set

$$a'' = \mathrm{enc}_\Gamma(a', \sqcup^{M/2-\ell_a}).$$

Next, the player computes the low-degree encoding $g_{a''}$ of $a''$ using the low-degree encoding
described in Section 3.4, in the same way as described in Section 10.4.2. The player then returns
the restriction of $g_{a''}$ to the axis-parallel line (resp. the diagonal line) specified by $x_\Pi$.




(b) If $t_Q = \mathrm{ORACLE}$, for $v \in \{A,B\}$ the player computes questions $x_v = L^v(x_Q)$, as in Step 1 of
$\hat{D}^{\mathrm{ORAC}}$. The player parses $a_Q$ as a pair $(a_A, a_B)$. Let $a'_A = a_A$ if $a_A$ has length at most T, and
let $a'_A$ be the truncation of $a_A$ to its first T symbols otherwise. Let $\ell_A \le T$ be the length of $a'_A$
and set
$$a''_A = \mathrm{enc}_\Gamma(a'_A, \sqcup^{M/2-\ell_A}).$$
Define $a''_B$ similarly. The player computes a PCP proof $\Pi = (g_1,\dots,g_5,c_0,\dots,c_{m'})$ as
described in the completeness case of Theorem 10.25 for the tuple $(D, n, T, x_A, x_B)$, where the
polynomials $g_1, g_2$ are low-degree encodings of $a''_A$ and $a''_B$, respectively.


i. If $t_\Pi \in \{\mathrm{POINT}_i, \mathrm{ALINE}_i\}$ (resp. $\mathrm{DLINE}_i$ instead of $\mathrm{ALINE}_i$) for $i \in \{1,\dots,5\}$, the player
returns the restriction of $g_i$ to the axis-parallel line (resp. diagonal line) of $\mathbb{F}_q^m$ specified by
$x_\Pi$.

ii. If $t_\Pi \in \{\mathrm{POINT}_6, \mathrm{ALINE}_6\}$ (resp. $\mathrm{DLINE}_6$ instead of $\mathrm{ALINE}_6$), the player returns the
restriction of all the polynomials $g_1,\dots,g_5,c_0,\dots,c_{m'}$ to the axis-parallel line of $\mathbb{F}_q^{m'}$ (resp.
the diagonal line) specified by $x_\Pi$.


(c) In all other cases the player returns 0. 
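The truncate-and-pad step used in cases (a) and (b) can be sketched in a few lines. This is only an illustration under stated assumptions: the 2-bit code `CODE` for the alphabet {0, 1, blank} is hypothetical (the paper's $\mathrm{enc}_\Gamma$ may assign bits differently), the names `truncate_pad_encode` and `decode_prefix` are ours, and we assume $T \le M/2$ so that the padded string always has exactly M/2 symbols.

```python
BLANK = "_"  # stands in for the blank symbol of the ternary alphabet

# Hypothetical 2-bit code for {0, 1, blank}; an assumption made for illustration.
CODE = {"0": "00", "1": "01", BLANK: "10"}
DECODE = {bits: sym for sym, bits in CODE.items()}


def truncate_pad_encode(answer: str, T: int, M: int) -> str:
    """Truncate to T symbols, pad with blanks to M/2 symbols, encode into M bits."""
    if M % 2 != 0 or T > M // 2:
        raise ValueError("need M even and T <= M/2")
    prefix = answer[:T]
    padded = prefix + BLANK * (M // 2 - len(prefix))
    return "".join(CODE[s] for s in padded)


def decode_prefix(bits: str):
    """Recover the prefix, or return None if bits is not a valid padded encoding."""
    if len(bits) % 2 != 0:
        return None
    symbols = []
    for i in range(0, len(bits), 2):
        sym = DECODE.get(bits[i:i + 2])
        if sym is None:
            return None
        symbols.append(sym)
    prefix = "".join(symbols).rstrip(BLANK)
    # A blank in the middle of the string is not of the form prefix + blanks.
    return None if BLANK in prefix else prefix
```

Truncation loses nothing that the decider reads within time T, which is the point made in the completeness argument; `decode_prefix` plays the role of the inverse direction used later in the soundness proof.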


The strategy $\mathscr{S}^{\mathrm{AR}}$ is projective and consistent because $\mathscr{S}^{\mathrm{ORAC}}$ is. To show that it has value 1, we first
observe that by definition it satisfies all consistency checks. Moreover, the strategy passes all low-degree
tests with certainty because it always returns restrictions of consistent polynomials. Finally, it also passes
the game check with probability 1. This follows from the completeness statement of the PCP made in
Theorem 10.25 and the fact that, if D accepts the input $(n, x_A, x_B, a_A, a_B)$ in time at most T then it also
accepts $(n, x_A, x_B, a'_A, a'_B)$ in time at most T, where $a'_A$ and $a'_B$ are obtained from $a_A$ and $a_B$ by truncating
them to strings of length T if their lengths exceed T.

To show that $\mathscr{S}^{\mathrm{AR}}$ is commuting, note that using the product structure of $\hat{S}^{\mathrm{AR}}$ every typed question pair
with positive probability consists of a pair of questions $((t_{Q,A}, x_{Q,A}), (t_{Q,B}, x_{Q,B}))$ with positive probability
for $\hat{S}^{\mathrm{ORAC}}$, together with an arbitrary pair $((t_{\Pi,A}, x_{\Pi,A}), (t_{\Pi,B}, x_{\Pi,B}))$. Using that $\mathscr{S}^{\mathrm{ORAC}}$ is commuting
and that the additional operations associated with $((t_{\Pi,A}, x_{\Pi,A}), (t_{\Pi,B}, x_{\Pi,B}))$ amount to classical
post-processing it follows that $\mathscr{S}^{\mathrm{AR}}$ is commuting.

This establishes the existence of a symmetric PCC strategy for $\hat{V}^{\mathrm{AR}}_n$ with value 1. By Lemma 6.18 it
follows that there exists a symmetric PCC strategy for $V^{\mathrm{AR}}_n = \mathrm{Detype}(\hat{V}^{\mathrm{AR}})_n$ with value 1.

□


10.7 Soundness of the answer-reduced verifier 


Proof of the soundness part of Theorem 10.27. We first show the soundness for the typed verifier $\hat{V}^{\mathrm{AR}}$.
Soundness for the detyped verifier $V^{\mathrm{AR}}$ follows from Lemma 6.18, with a constant-factor loss using that
the type set $T^{\mathrm{AR}}$ for $\hat{V}^{\mathrm{AR}}$ has constant size.

We proceed in two steps. Fix an index $n \ge 1$ and suppose that $\mathrm{val}^*(\hat{V}^{\mathrm{AR}}_n) > 1 - \varepsilon$ for some $\varepsilon > 0$.
Observe that $\hat{S}^{\mathrm{ORAC}}$ and $S^{\mathrm{PCP}}$ both sample distributions that are invariant under permutation of the two
players; therefore, the same holds for $\hat{S}^{\mathrm{AR}}$. Moreover, the decider $\hat{D}^{\mathrm{AR}}$ treats both players symmetrically.
Therefore, the game played by $\hat{V}^{\mathrm{AR}}_n$ is a symmetric game. Applying Lemma 5.7 it follows that there exists
a symmetric projective strategy $\mathscr{S} = (|\psi\rangle, M)$ for $\hat{V}^{\mathrm{AR}}_n$ with value greater than $1 - \varepsilon$.

We use the following shorthand notation. A pair of questions to the players is $((t_A, x'_A), (t_B, x'_B))$ where
for $w \in \{A,B\}$, $t_w = (t_{Q,w}, t_{\Pi,w})$ and $x'_w = (x_{Q,w}, x_{\Pi,w})$. When w is clear from context we omit it from
the subscript. Fixing a w, whenever $t_Q = \mathrm{ORACLE}$ we introduce $x_A = L^A(x_Q)$ and $x_B = L^B(x_Q)$ and
often write directly the player’s question as $x_Q = (x_A, x_B)$. Whenever $t_Q = v \in \{A,B\}$ we slightly abuse
notation and write the question as $x_Q = (x_v, v)$, explicitly including the type to clarify which player it
points to.

We denote the measurements used by both players in strategy $\mathscr{S}$ by $\{M(x_Q)^{x_\Pi}_a\}$, where for the sake of
clarity we have notationally separated the two parts $x_Q$ and $x_\Pi$ of the question and omitted explicit mention
of the associated types $t_Q$ and $t_\Pi$ (we include the type and write $M(x_Q)^{t_\Pi, x_\Pi}_a$ when it is needed for clarity).
First we show that the strategy $\mathscr{S}$ is close to a strategy $\mathscr{S}'$ that performs “low-degree” measurements: upon
receipt of a typed question $(t,x) = ((t_Q, t_\Pi), (x_Q, x_\Pi))$ a player first performs a measurement depending
on $x_Q$ to obtain a tuple of low-degree polynomials, and then returns evaluations of those polynomials on
the subspaces (either axis-parallel or diagonal lines) specified by $x_\Pi$. This step of the argument uses the
quantum soundness of the low-degree test performed in Steps 3 and 4 of Figure 14. Next, we “decode”
this strategy to produce a strategy $\mathscr{S}''$ for $\hat{V}^{\mathrm{ORAC}}_n$ with a high value. This step makes use of the classical
soundness of the underlying PCP shown in Section 10.4. The conclusion of the theorem then follows from
the soundness of $\hat{V}^{\mathrm{ORAC}}$ (Theorem 9.3). We proceed with the details.

We start by showing a sequence of claims that establish approximations implied by the assumption that
$\mathscr{S}$ succeeds with probability greater than $1 - \varepsilon$ in the decision procedure implemented by the decider in
Figure 14.


Claim 10.28 (Global consistency check, Step 1). On average over questions $(t_A, x_A) = ((t_Q, t_\Pi), (x_Q, x_\Pi))$
sampled from the marginal distribution of $\mu_{\hat{S}^{\mathrm{AR}}}$ on the first player it holds that

$$M(x_Q)^{x_\Pi}_a \otimes I \simeq_{\varepsilon} I \otimes M(x_Q)^{x_\Pi}_a. \tag{136}$$


Proof. First we observe that the condition $t_A = t_B$ for the global consistency check, Step 1 in Figure 14,
holds with constant probability over the choice of a pair of questions $(t_A, x_A), (t_B, x_B)$ sampled according
to $\mu_{\hat{S}^{\mathrm{AR}}}$: this is because each of $\hat{S}^{\mathrm{ORAC}}$ and $S^{\mathrm{PCP}}$ has a constant probability of returning a pair of
questions of the same type. Thus $\mathscr{S}$ must succeed in this test with probability $1 - O(\varepsilon)$, conditioned on the
test being executed.

Moreover, observe that conditioned on $t_A = t_B$ a pair of questions $((t_A, x_A), (t_A, x_B)) \sim \mu_{\hat{S}^{\mathrm{AR}}}$ is
such that $x_A = x_B = L_{t_A}(z)$, where z is the sampler seed and $L_{t_A}$ the CL function of type $t_A$ associated
with $\hat{S}^{\mathrm{AR}}$. The claim then follows directly from the test and the definition of approximate consistency
(Definition 5.15). □


Claim 10.29 (Input consistency check, Step 2). For all $v \in \{A,B\}$, on average over question pairs
$(x_A, x_B) \sim \mu_S$ and $z = (y_1,\dots,y_5,o,w) \in \mathbb{F}_q^{m'}$ sampled uniformly at random,

$$M(x_A, x_B)^{\mathrm{POINT}_6,\, z}_{\alpha_v} \otimes I \simeq_{\varepsilon} I \otimes M(x_v, v)^{\mathrm{POINT}_v,\, y_v}_{\alpha_v}, \tag{137}$$

where as in Remark 10.26 we made the identification $1 \leftrightarrow A$ and $2 \leftrightarrow B$. Moreover, an analogous relation
holds for operators acting on opposite sides of the tensor product.


Proof. For w = A and fixed $v \in \{A,B\}$ there is a constant probability that $t_{Q,w} = \mathrm{ORACLE}$, $t_{Q,\bar{w}} = v$,
and $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_6, \mathrm{POINT}_v)$. Therefore, the input consistency check in Step 2 is executed with
constant probability, and $\mathscr{S}$ must pass it with probability $1 - O(\varepsilon)$, conditioned on the test being executed.

Moreover, conditioned on $t_{Q,w} = \mathrm{ORACLE}$, $t_{Q,\bar{w}} = v$, and $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_6, \mathrm{POINT}_v)$, the
distribution of $(x_{Q,w}, x_{Q,\bar{w}})$ is $((x_A, x_B), x_v)$ for $(x_A, x_B) \sim \mu_S$ and the distribution of $(x_{\Pi,w}, x_{\Pi,\bar{w}})$ is
$(z, y_v)$ for a uniformly random $z \in \mathbb{F}_q^{m'}$. Eq. (137) then follows directly from the specification of the test
and the definition of approximate consistency. The “moreover” part follows from the case w = B. □




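Claim 10.30 (Input low degree test, Step 3). For each $v \in \{A,B\}$ and for each x in the support of the
marginal of $\mu_S$ on player v there exists a measurement $\{G^{x,v}_g\} \in \mathrm{PolyMeas}(m, d, q)$ such that the following
hold for some $\delta_1 = O(\delta_{\mathrm{LD}}(O(\varepsilon), q, m, d, 1))$, where $\delta_{\mathrm{LD}}$ is defined in Theorem 7.8. For all $v \in \{A,B\}$, on
average over x chosen from the marginal of $\mu_S$ on player v and $y_v \in \mathbb{F}_q^m$ sampled uniformly at random,

$$M(x, v)^{\mathrm{POINT}_v,\, y_v}_a \otimes I \simeq_{\delta_1} I \otimes G^{x,v}_{[\mathrm{eval}_{y_v}(\cdot)=a]}, \tag{138}$$
$$I \otimes M(x, v)^{\mathrm{POINT}_v,\, y_v}_a \simeq_{\delta_1} G^{x,v}_{[\mathrm{eval}_{y_v}(\cdot)=a]} \otimes I, \tag{139}$$
$$G^{x,v}_g \otimes I \simeq_{\delta_1} I \otimes G^{x,v}_g, \tag{140}$$

where we used the notation $\mathrm{eval}_{y_v}(g) = g(y_v)$ for the evaluation map.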


Proof. Fix $v \in \{A,B\}$. For any $w \in \{A,B\}$ there is a constant probability that $t_{Q,w} = t_{Q,\bar{w}} = v$
and $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_v, \mathrm{ALINE}_v)$ (resp. $\mathrm{DLINE}_v$). Therefore, the input low degree test in Step 3 of
Figure 14 is executed with constant probability, and $\mathscr{S}$ must pass it with probability $1 - O(\varepsilon)$, conditioned
on the test being executed.

Observe that by definition the distribution of $(x_{\Pi,A}, x_{\Pi,B})$ conditioned on $t_{Q,w} = t_{Q,\bar{w}} = v$, uniformly
random $x_Q = (x_v, v)$, and $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_v, \mathrm{ALINE}_v)$ (resp. $\mathrm{DLINE}_v$), where $w \in \{A,B\}$ is
uniformly random, is exactly the distribution of questions in the game $\mathfrak{G}^{\mathrm{LD}}_{\mathrm{ldparams}}$ described in Section 7.1.1,
parametrized by ldparams = (q, m, d, 1).

For every $v \in \{A,B\}$ and question $x = L^v(z)$ in the support of the marginal distribution of $\mu_S$ on
player v let $\varepsilon_{x,v}$ be the probability that $\mathscr{S}$ is rejected in Step 3, conditioned on the test being executed and on
average over $w \in \{A,B\}$. Then $\mathbb{E}[\varepsilon_{x,v}] = O(\varepsilon)$, where the expectation is taken over a uniformly random
$v \in \{A,B\}$ and $x = L^v(z)$ for uniformly random z.

By definition it follows that the strategy $\mathscr{S}$ conditioned on the first part of the players’ questions being
$t_{Q,w} = t_{Q,\bar{w}} = v$ and $x_{Q,A} = x_{Q,B} = x$ is a projective strategy that succeeds with probability $1 - \varepsilon_{x,v}$ in
the low-degree test $\mathfrak{G}^{\mathrm{LD}}_{\mathrm{ldparams}}$ executed in Item 3.

We may thus apply Theorem 7.8 to obtain $\{G^{x,v}_g\} \in \mathrm{PolyMeas}(m, d, q)$ such that (138), (139) and (140)
each hold with approximation error $O(\delta_{\mathrm{LD}}(\varepsilon_{x,v}, q, m, d, 1))$. Using that for fixed q, m, d the function
$\varepsilon \mapsto \delta_{\mathrm{LD}}(\varepsilon, q, m, d, 1)$ is concave, the claim follows from Jensen’s inequality. □
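The final averaging step of this proof can be written out as a one-line calculation. The display below is a sketch in the notation of Claim 10.30: it writes $\varepsilon_{x,v}$ for the per-question rejection probability and additionally assumes that $\delta_{\mathrm{LD}}$ is nondecreasing in its first argument, which is how the bound $\mathbb{E}[\varepsilon_{x,v}] = O(\varepsilon)$ is used in the last step.

```latex
\mathop{\mathbb{E}}_{x,v}\,\delta_{\mathrm{LD}}(\varepsilon_{x,v}, q, m, d, 1)
  \;\le\; \delta_{\mathrm{LD}}\Big(\mathop{\mathbb{E}}_{x,v}\,\varepsilon_{x,v},\, q, m, d, 1\Big)
  \;\le\; \delta_{\mathrm{LD}}\big(O(\varepsilon), q, m, d, 1\big).
```

The first inequality is Jensen's inequality for the concave map $\varepsilon \mapsto \delta_{\mathrm{LD}}(\varepsilon, q, m, d, 1)$; the second is the monotonicity assumption.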


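Claim 10.31 (Proof encoding checks, Step 4). For each $x_Q = (x_A, x_B)$ in the support of $\mu_S$ there exist
measurements $\{G^{(x_A,x_B),i}_g\} \in \mathrm{PolyMeas}(m, d, q)$ for each $i \in \{3,4,5\}$ and

$$\{J^{x_A,x_B}_{f_1,\dots,f_5,c_0,\dots,c_{m'}}\} \in \mathrm{PolyMeas}(m', d, q, m'+6)$$

such that the following hold for some

$$\delta_2 = O\big(\delta_{\mathrm{LD}}(O(\varepsilon), q, m, d, 1) + \delta_{\mathrm{LD}}(O(\varepsilon), q, m', d, m'+6)\big).$$

First, for all $i \in \{3,4,5\}$, on average over $(x_A, x_B) \sim \mu_S$ and $z = (y_1,\dots,y_5,o,w)$ of type $\mathrm{POINT}_6$
sampled uniformly at random,

$$I \otimes M(x_A, x_B)^{\mathrm{POINT}_i,\, y_i}_{\alpha_i} \simeq_{\varepsilon} M(x_A, x_B)^{\mathrm{POINT}_6,\, z}_{\alpha_i} \otimes I. \tag{141}$$

Second, for all $i \in \{3,4,5\}$ and on average over $(x_A, x_B) \sim \mu_S$ and $y_i \in \mathbb{F}_q^m$ sampled uniformly at
random,

$$M(x_A, x_B)^{\mathrm{POINT}_i,\, y_i}_a \otimes I \simeq_{\delta_2} I \otimes G^{(x_A,x_B),i}_{[\mathrm{eval}_{y_i}(\cdot)=a]}, \tag{142}$$
$$G^{(x_A,x_B),i}_g \otimes I \simeq_{\delta_2} I \otimes G^{(x_A,x_B),i}_g. \tag{143}$$

Third, for all $i \in \{1,\dots,5\}$ and $j \in \{0,\dots,m'\}$, on average over $(x_A, x_B) \sim \mu_S$ and $z \in \mathbb{F}_q^{m'}$ sampled
uniformly at random,

$$M(x_A, x_B)^{\mathrm{POINT}_6,\, z}_{\alpha_1,\dots,\alpha_5,\beta_0,\dots,\beta_{m'}} \otimes I \simeq_{\delta_2} I \otimes J^{x_A,x_B}_{[\mathrm{eval}_z(\cdot)=(\alpha_1,\dots,\alpha_5,\beta_0,\dots,\beta_{m'})]}, \tag{144}$$
$$J^{x_A,x_B}_{f_1,\dots,f_5,c_0,\dots,c_{m'}} \otimes I \simeq_{\delta_2} I \otimes J^{x_A,x_B}_{f_1,\dots,f_5,c_0,\dots,c_{m'}}. \tag{145}$$

Moreover, analogous equations to (141), (142) and (144) hold with operators acting on opposite sides of
the tensor product. Here, recall the definition of $\mathrm{PolyMeas}(\cdot,\cdot,\cdot)$ from Definition 7.7.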


Proof. The proof of the first item is similar to the proof of Claim 10.29, and we omit it.

The proof of the second and third items is similar to the proof of Claim 10.30, and we include more
details. Fix an $i \in \{3,4,5\}$. For any $w \in \{A,B\}$ there is a constant probability that $t_{Q,w} = t_{Q,\bar{w}} = \mathrm{ORACLE}$
and $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_i, \mathrm{ALINE}_i)$ (or $\mathrm{DLINE}_i$ instead of $\mathrm{ALINE}_i$), in which case the individual
low-degree test in Step 4b is executed. Therefore, $\mathscr{S}$ must succeed in that part of the test with probability
$1 - O(\varepsilon)$ conditioned on the test being executed.

Furthermore, for fixed $i \in \{3,4,5\}$ and uniformly random $w \in \{A,B\}$, conditioned on the test being
executed for that i and w the distribution of $(x_{\Pi,A}, x_{\Pi,B})$ is exactly the distribution of questions in the game
$\mathfrak{G}^{\mathrm{LD}}_{\mathrm{ldparams}}$ described in Section 7.1.1, parametrized by ldparams = (q, m, d, 1).

For every $i \in \{3,4,5\}$ and $x = (x_A, x_B)$ in the support of $\mu_S$ let $\varepsilon_{x,i}$ be the probability that $\mathscr{S}$ is
rejected in Step 4b, conditioned on the test being executed for that i and on average over $w \in \{A,B\}$. Then
for each i, $\mathbb{E}[\varepsilon_{x,i}] = O(\varepsilon)$, where the expectation is taken over $x \sim \mu_S$.

By definition of the individual low-degree test it follows from Theorem 7.8 that for every $x = (x_A, x_B)$
in the support of $\mu_S$ and $i \in \{3,4,5\}$ there is a measurement $\{G^{(x_A,x_B),i}_g\} \in \mathrm{PolyMeas}(m, d, q)$ such that
on average over $y_i \in \mathbb{F}_q^m$ of type $\mathrm{POINT}_i$ sampled uniformly at random, Eq. (142) and (143) both hold
with approximation error $O(\delta_{\mathrm{LD}}(\varepsilon_{x,i}, q, m, d, 1))$. Eq. (142) and (143) follow using the concavity of $\delta_{\mathrm{LD}}$ as a
function of $\varepsilon$.

Finally we consider the simultaneous low-degree test, Step 4c. Here as well, using that there is a
constant probability that $t_{Q,w} = t_{Q,\bar{w}} = \mathrm{ORACLE}$ and $(t_{\Pi,w}, t_{\Pi,\bar{w}}) = (\mathrm{POINT}_6, \mathrm{ALINE}_6)$ (resp. $\mathrm{DLINE}_6$)
it follows that $\mathscr{S}$ must succeed in that part of the test with probability $1 - O(\varepsilon)$. Using a similar
argument as before it follows from Theorem 7.8 (this time for parameters $(q, m', d, m'+6)$) that for every
$(x_A, x_B)$ there is a family of measurements $\{J^{x_A,x_B}_{f_1,\dots,f_5,c_0,\dots,c_{m'}}\} \in \mathrm{PolyMeas}(m', d, q, m'+6)$ such that on
average over $z \in \mathbb{F}_q^{m'}$ sampled uniformly at random, Eq. (144) and (145) both hold with approximation
error $O(\delta_{\mathrm{LD}}(O(\varepsilon), q, m', d, m'+6))$. □


The families of measurements $\{G^{x_Q,i}_g\}$ and $\{J^{x_Q}_{f_1,\dots,f_5,c_0,\dots,c_{m'}}\}$, for $x_Q$ in the support of $\mu_S$ and
$i \in \{1,\dots,5\}$, whose existence follows from Claim 10.30 and Claim 10.31, have outcomes that are low
(individual) degree polynomials: for the first family, individual degree d polynomials $g : \mathbb{F}_q^m \to \mathbb{F}_q$, and for
the second, tuples of individual degree d polynomials $f_i, c_j : \mathbb{F}_q^{m'} \to \mathbb{F}_q$. Recall that $m' = 5m + 5 + s$ and
that an element $z \in \mathbb{F}_q^{m'}$ is written as a triple $(y, o, w)$ with $y = (y_1,\dots,y_5) \in \mathbb{F}_q^{5m}$, $o \in \mathbb{F}_q^5$ and $w \in \mathbb{F}_q^s$.
The following claim, whose proof is based on Lemma 5.26, shows that we can reduce to a situation where
the polynomials $f_1,\dots,f_5$ returned by J are such that for each $i \in \{1,\dots,5\}$, $f_i$ only depends on the $y_i$,
and not on the entire variable z.
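The block structure of a point z can be made concrete with a small helper. This is an illustrative sketch only: the function name `split_point` is ours, and the coordinate order (the five y-blocks first, then o, then w) is taken from the triple $(y, o, w)$ in the text.

```python
def split_point(z, m, s):
    """Split z, of length m' = 5m + 5 + s, into ((y_1, ..., y_5), o, w)."""
    m_prime = 5 * m + 5 + s
    if len(z) != m_prime:
        raise ValueError(f"expected {m_prime} coordinates, got {len(z)}")
    # Five consecutive blocks of m coordinates each.
    ys = [tuple(z[i * m:(i + 1) * m]) for i in range(5)]
    # Then 5 coordinates for o, and the remaining s coordinates for w.
    o = tuple(z[5 * m:5 * m + 5])
    w = tuple(z[5 * m + 5:])
    return ys, o, w
```

A polynomial $f_i$ that "only depends on $y_i$" is then one whose value is unchanged when any coordinate outside the i-th block returned by `split_point` is modified.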


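Claim 10.32. For all $(x_A, x_B)$ in the support of $\mu_S$ and individual degree d polynomials $g_1,\dots,g_5 : \mathbb{F}_q^m \to
\mathbb{F}_q$ and $c_0,\dots,c_{m'} : \mathbb{F}_q^{m'} \to \mathbb{F}_q$ define

$$\Lambda^{x_A,x_B}_{g_1,\dots,g_5,c_0,\dots,c_{m'}} = G^{x_A,A}_{g_1} \cdot G^{x_B,B}_{g_2} \cdot G^{(x_A,x_B),3}_{g_3} \cdot G^{(x_A,x_B),4}_{g_4} \cdot G^{(x_A,x_B),5}_{g_5} \cdot J^{x_A,x_B}_{c_0,\dots,c_{m'}} \cdot G^{(x_A,x_B),5}_{g_5} \cdot G^{(x_A,x_B),4}_{g_4} \cdot G^{(x_A,x_B),3}_{g_3} \cdot G^{x_B,B}_{g_2} \cdot G^{x_A,A}_{g_1}, \tag{146}$$

where the outcomes $(f_1,\dots,f_5)$ of the J operator in the middle have been marginalized over. Then there is
a

$$\delta_3 = O\Big(\big(\delta_{\mathrm{LD}}(\varepsilon, q, m', d, m'+6)^{1/2} + m'd/q\big)^{1/2}\Big)$$

such that
$$\delta_3 \ge \max\{\varepsilon, \delta_1, \delta_2\}$$

and on average over $(x_A, x_B) \sim \mu_S$ and $z \in \mathbb{F}_q^{m'}$ sampled uniformly at random,
$$\Lambda^{x_A,x_B}_{[\mathrm{eval}_z(\cdot)=(\alpha,\beta)]} \otimes I \simeq_{\delta_3} I \otimes J^{x_A,x_B}_{[\mathrm{eval}_z(\cdot)=(\alpha,\beta)]}. \tag{147}$$
Moreover, a similar equation holds with the operators acting on opposite sides of the tensor product.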


Proof. We apply Lemma 5.26 with the following setting of parameters. The number of sets of functions k
is set to 6. The question set V is set to the support of $\mu_S$, and the distribution $\mu$ on it is the distribution $\mu_S$.
The sets $\mathcal{G}_i$ for $i \in \{1,\dots,5\}$ consist of individual degree d polynomials over $\mathbb{F}_q^{m'}$ that depend only on the
i-th block of m variables. The set $\mathcal{G}_6$ consists of $(m'+1)$-tuples of individual degree d polynomials over
$\mathbb{F}_q^{m'}$.

We first verify the assumption on the sets of functions. Since all polynomials have individual degree at
most d (and therefore total degree at most $m'd$), by Lemma 3.20 the corresponding parameter in Lemma 5.26
can be set to $m'd/q$.

The family of joint measurements in Lemma 5.26 is the family of measurements $\{J^{(x_A,x_B)}_{f_1,\dots,f_5,c_0,\dots,c_{m'}}\}$
here, where we set $g_i = f_i$ for $i \in \{1,\dots,5\}$ and $g_6 = (c_0,\dots,c_{m'})$. The measurements $\{G^i_g\}$ in
Lemma 5.26 are $\{G^{x_v,v}_g\}$ for $i \in \{1,2\}$, $\{G^{(x_A,x_B),i}_g\}$ for $i \in \{3,4,5\}$, and $\{J^{(x_A,x_B)}_{c_0,\dots,c_{m'}}\}$ for i = 6. To ensure
that all polynomials are defined over the same range, we treat $g : \mathbb{F}_q^m \to \mathbb{F}_q$ that is an outcome of some
$\{G^i_g\}$ as a polynomial $g' : \mathbb{F}_q^{m'} \to \mathbb{F}_q$, where the role of the m variables of g is taken by the i-th block of
m variables of $g'$.

We verify that assumption (40) in the lemma holds with an error $\delta = O((\max\{\varepsilon, \delta_1, \delta_2\})^{1/2})$. For
$i \in \{1,2\}$ the assumption follows by combining (137) and (139) with (144) and Fact 5.24. For $i \in \{3,4,5\}$
we use (141) instead of (137) and (142) instead of (139). Finally, for i = 6 we use (145) and Fact 5.24. In
these derivations, we use Fact 5.23, the triangle inequality for “$\simeq$”.

The conclusion follows from Lemma 5.26, with an error $\delta_3 = O((\delta + m'd/q)^{1/2})$. Observing that
$\varepsilon, \delta_1, \delta_2 = O(\delta_{\mathrm{LD}}(\varepsilon, q, m', d, m'+6))$, as can be verified from the definition of $\delta_{\mathrm{LD}}$ given in Theorem 7.8,
and therefore $\delta = O(\delta_{\mathrm{LD}}(\varepsilon, q, m', d, m'+6)^{1/2})$, the claimed bound

$$\delta_3 = O\Big(\big(\delta_{\mathrm{LD}}(\varepsilon, q, m', d, m'+6)^{1/2} + m'd/q\big)^{1/2}\Big)$$

follows.
□

At this point, we have constructed measurements G and $\Lambda$ that return low-degree polynomials in a
similar way as is expected from the honest strategy in $\hat{V}^{\mathrm{AR}}$, as described in the proof of Theorem 10.27.
These measurements can be used to specify a new strategy $\mathscr{S}'$ for the game $\hat{V}^{\mathrm{AR}}_n$ as follows. The shared
state remains the state $|\psi\rangle$ used in $\mathscr{S}$. For $w \in \{A,B\}$, upon reception of a question $(t_w, x_w)$ player w
performs the following. If $t_w = (t_{Q,w}, t_{\Pi,w})$ is such that $t_{Q,w} = \mathrm{ORACLE}$, the player measures their share
of $|\psi\rangle$ using the measurement $\Lambda^{x_Q}$ defined in Claim 10.32 to obtain a tuple $(g_1,\dots,g_5,c_0,\dots,c_{m'})$ of
polynomials. The player then answers exactly as in the strategy described in the “completeness” part of the
proof of Theorem 10.27. Similarly, if $t_{Q,w} = v \in \{A,B\}$ the player first measures their share of $|\psi\rangle$ using
the measurement $G^{x_v,v}$ from Claim 10.30 to obtain a polynomial g as outcome; the player then answers
according to the same honest strategy.


Lemma 10.33. The strategy $\mathscr{S}'$ succeeds with probability $1 - O(\delta_3^{1/2})$ in the game $\hat{V}^{\mathrm{AR}}_n$.


Proof. First we establish useful consistency relations. By combining Equation (147) and Equation (144)
and applying Fact 5.24 we obtain that for all $i \in \{1,\dots,5\}$, on average over $(x_A, x_B) \sim \mu_S$ and $z \in \mathbb{F}_q^{m'}$
sampled uniformly at random,

$$M(x_A, x_B)^{\mathrm{POINT}_6,\, z}_{(\alpha,\beta)} \otimes I \simeq_{\delta_3} I \otimes \Lambda^{x_A,x_B}_{[\mathrm{eval}_z(\cdot)=(\alpha,\beta)]} \tag{148}$$

and a similar equation holds with the operators acting on opposite sides of the tensor product. Next,
combining (148) together with Equation (137) and Equation (138) it follows that for each $v \in \{A,B\}$, on average
over $(x_A, x_B) \sim \mu_S$ and $z = (y_1,\dots,y_5,o,w) \in \mathbb{F}_q^{m'}$ sampled uniformly at random,

$$G^{x_v,v}_{[\mathrm{eval}_{y_v}(\cdot)=a_v]} \otimes I \simeq_{\delta_3} I \otimes \Lambda^{(x_A,x_B)}_{[\mathrm{eval}_z(\cdot)_v=a_v]}. \tag{149}$$

From the Schwartz-Zippel lemma (Lemma 3.20) it follows that the probability that any two distinct
individual degree d polynomials $g_v$ (an outcome of $G^{x_v,v}$) and $g'_v$ (an outcome of $\Lambda^{(x_A,x_B)}$) agree at a uniformly
random point $y_v \in \mathbb{F}_q^m$ is at most $m'd/q$. It thus follows from (149) that for all $v \in \{A,B\}$, on average
over $(x_A, x_B) \sim \mu_S$ and $y_v \in \mathbb{F}_q^m$ sampled uniformly at random,

$$G^{x_v,v}_{g_v} \otimes I \simeq_{\delta_3 + m'd/q} I \otimes \Lambda^{(x_A,x_B)}_{g_v}. \tag{150}$$

We now show that $\mathscr{S}'$ is accepted by $\hat{V}^{\mathrm{AR}}_n$ with high probability. We bound the probability of succeeding in
each subtest.

First note that the strategy is accepted in Step 1 of Figure 14. For the G measurements, consistency
follows from (140). For the $\Lambda$ measurements, note first that by (147) and (145) it follows that on average
over $(x_A, x_B) \sim \mu_S$,

$$\Lambda^{x_A,x_B}_{[\mathrm{eval}_z(\cdot)=(\alpha,\beta)]} \otimes I \simeq_{\delta_3} I \otimes \Lambda^{x_A,x_B}_{[\mathrm{eval}_z(\cdot)=(\alpha,\beta)]}. \tag{151}$$

Using that all outcomes of $\Lambda^{x_A,x_B}$ are individual degree d polynomials and the Schwartz-Zippel lemma
(Lemma 3.20) it follows that whenever a measurement of $\Lambda^{x_A,x_B} \otimes \Lambda^{x_A,x_B}$ returns distinct outcomes, the
outcomes take a different value at a uniformly random z with probability at least $1 - m'd/q$. It then follows
from (151) and the fact that $\delta_3 \ge m'd/q$ by definition that the strategy $\mathscr{S}'$ is accepted in Step 1 with
probability $1 - O(\delta_3)$.
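The Schwartz-Zippel step can be checked by brute force on a toy instance. The sketch below is an illustration under our own assumptions: q is a small prime (so arithmetic mod q is field arithmetic), polynomials are stored as dictionaries from exponent tuples to coefficients, and the helper names `evaluate` and `agreement_fraction` are ours.

```python
import itertools


def evaluate(coeffs, point, q):
    """Evaluate a polynomial {exponent tuple: coefficient} at a point of F_q^m (q prime)."""
    total = 0
    for exps, c in coeffs.items():
        term = c % q
        for x, e in zip(point, exps):
            term = term * pow(x, e, q) % q
        total = (total + term) % q
    return total


def agreement_fraction(f, g, m, q):
    """Fraction of points of F_q^m on which f and g take the same value."""
    points = itertools.product(range(q), repeat=m)
    agree = sum(evaluate(f, p, q) == evaluate(g, p, q) for p in points)
    return agree / q ** m


# Two distinct polynomials of individual degree at most d = 2 in m = 2 variables over F_7.
q, m, d = 7, 2, 2
f = {(2, 2): 1, (1, 0): 3}             # x^2 y^2 + 3x
g = {(2, 2): 1, (1, 0): 3, (1, 1): 1}  # x^2 y^2 + 3x + xy
fraction = agreement_fraction(f, g, m, q)
```

Here f and g differ by the monomial xy, so they agree exactly when x = 0 or y = 0, that is on 13 of the 49 points. This is below the bound md/q = 4/7 that follows from the total degree being at most md.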




Next, the strategy is also accepted in the consistency check performed in item 2 due to (150), and the
consistency check in item 4(a) for the same reasons as for item 1. Finally, for the low-degree tests performed
in item 3 and items 4(b) and 4(c), the strategy succeeds due to consistency and the fact that, as long as both
players obtain the same polynomial outcomes, they pass the low-degree tests with probability 1.

It remains to analyze the strategy’s success probability in Step 5, the game check. Recall that this
check is nontrivial only when $t_{Q,w} = \mathrm{ORACLE}$ and $t_{\Pi,w} = \mathrm{POINT}_6$. By assumption the original strategy
$\mathscr{S}$ succeeds with probability $1 - O(\varepsilon)$ in the game check. Thus, it will suffice to show that the answer
distribution from $\mathscr{S}'$ is close to the answer distribution from $\mathscr{S}$ whenever $t_{Q,w} = \mathrm{ORACLE}$ and $t_{\Pi,w} =
\mathrm{POINT}_6$ for either or both players $w \in \{A,B\}$. For these question types, the measurement operator used by
$\mathscr{S}$ is $M(x_A, x_B)^{\mathrm{POINT}_6,\, z}_{(\alpha,\beta)}$ and the operator used by $\mathscr{S}'$ is $\Lambda^{x_A,x_B}_{[\mathrm{eval}_z(\cdot)=(\alpha,\beta)]}$. By converting Equation (148) and
Equation (151) to “$\approx$” bounds using Fact 5.18, and then applying the triangle inequality (Fact 5.22) we obtain
that on average over $(x_A, x_B) \sim \mu_S$,

$$\Lambda^{x_A,x_B}_{[\mathrm{eval}_z(\cdot)=(\alpha,\beta)]} \otimes I \approx_{\delta_3} M(x_A, x_B)^{\mathrm{POINT}_6,\, z}_{(\alpha,\beta)} \otimes I.$$

Thus, restricting to the game check, $\mathscr{S}$ and $\mathscr{S}'$ are $\delta_3$-close in the sense of Definition 5.17. They both
use the same state, and moreover, $\mathscr{S}$ is projective. Therefore, by Lemma 5.28, this implies that $\mathscr{S}'$ must
succeed in the game check with probability $1 - \varepsilon - O(\delta_3^{1/2}) \ge 1 - O(\delta_3^{1/2})$. Thus, we have shown that $\mathscr{S}'$
succeeds in all subtests of the game with probability $1 - O(\delta_3^{1/2})$, and thus the lemma follows. □
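We now complete the proof by a reduction to the game $\hat{V}^{\mathrm{ORAC}}_n$: from the strategy $\mathscr{S}'$ we construct a
symmetric strategy $\mathscr{S}'' = (|\psi\rangle, A)$ for $\hat{V}^{\mathrm{ORAC}}_n$ by “decoding” the low-degree measurements G and $\Lambda$. The
state $|\psi\rangle$ in $\mathscr{S}''$ is identical to the state used in $\mathscr{S}'$ (which is identical to the state used in $\mathscr{S}$). To begin, we
define a decoding map $\Delta(\cdot)$, which takes in a polynomial $g : \mathbb{F}_q^m \to \mathbb{F}_q$ and outputs a string in $\{0,1\}^*$. This
map is computed as follows:


• First, compute $a = \mathrm{Dec}(g) \in \{0,1\}^M$, where Dec is the (boolean) decoding of the low-degree code
defined in Section 3.4.


• If there exists an $a_{\mathrm{prefix}} \in \{0,1\}^*$ of length $\ell_a \le T$ such that $a = \mathrm{enc}_\Gamma(a_{\mathrm{prefix}}, \sqcup^{M/2-\ell_a})$, then
$\Delta(g) = a_{\mathrm{prefix}}$. Otherwise, $\Delta(g)$ is allowed to be arbitrary.


We can now define the “decoded” measurements $\{A^{x_v,v}_a\}$ and $\{A^{x_A,x_B}_{a_A,a_B}\}$ as follows:

$$A^{x_v,v}_a = G^{x_v,v}_{[\Delta(\cdot)=a]}, \qquad A^{x_A,x_B}_{a_A,a_B} = \Lambda^{x_A,x_B}_{[(\Delta(g_1),\Delta(g_2))=(a_A,a_B)]}. \tag{152}$$


Lemma 10.34. The strategy $\mathscr{S}''$ succeeds with probability $1 - O(\delta_3^{1/2})$ in the game $\hat{V}^{\mathrm{ORAC}}_n$.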


Proof. We consider the different subtests executed by $\hat{D}^{\mathrm{ORAC}}$ (see Figure 12). We start with Step 2, the
consistency checks. Success in the first check, Step 2a, follows from the success of $\mathscr{S}'$ in the global
consistency check, Step 1 of $\hat{D}^{\mathrm{AR}}$, the definition (152), and the fact that conditioned on $t_{Q,A} = t_{Q,B} =
\mathrm{ORACLE}$, the distribution of $(x_{Q,A}, x_{Q,B})$ in $\hat{V}^{\mathrm{AR}}$ is the same as the distribution of $(x_A, x_B)$ in $\hat{V}^{\mathrm{ORAC}}$,
conditioned on $t_A = t_B = \mathrm{ORACLE}$. Similarly, success in the second check, Step 2b, follows from success
of $\mathscr{S}'$ in the input consistency check, Step 2 of $\hat{D}^{\mathrm{AR}}$.

Next we consider the game check of $\hat{D}^{\mathrm{ORAC}}$, Step 1. To analyze the success probability of $\mathscr{S}''$ we use
that $\mathscr{S}'$ succeeds in the game check of $\hat{D}^{\mathrm{AR}}$, Step 5, and the soundness of the PCP, as shown in
Theorem 10.25. Let $p_{\mathrm{sound}}$ be as in Theorem 10.25.




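Let $w \in \{A,B\}$, $(x_A, x_B)$ be in the support of $\mu_S$, and $\Pi = (g_1,\dots,g_5,c_0,\dots,c_{m'})$ an outcome of
$\Lambda^{x_A,x_B}$ such that conditioned on that outcome being obtained by player w in the game check of $\hat{D}^{\mathrm{AR}}$, $M_{\mathrm{AR}}$
accepts the pair of inputs $(D, n, T, Q, \gamma, x_A, x_B)$ and $(z, a_w)$ with probability at least $p_{\mathrm{sound}}$ over the choice
of a uniformly random $z \in \mathbb{F}_q^{m'}$ and $a_w = \mathrm{eval}_z(\Pi)$.

For any such $\Pi$, the soundness statement of Theorem 10.25 states that there exist $a_A, a_B \in \{0,1\}^*$ such
that $D(n, x_A, x_B, a_A, a_B) = 1$ and furthermore $a_v = \Delta(g_v)$ for $v \in \{A,B\}$. It follows that for any proof $\Pi$
returned by $\Lambda^{x_A,x_B}$ which is accepted with probability greater than $p_{\mathrm{sound}}$ in the game check of $\hat{D}^{\mathrm{AR}}$ it
holds that $D(n, x_A, x_B, \Delta(g_A), \Delta(g_B)) = 1$. Using this observation we evaluate the probability $q''$ that the
strategy $\mathscr{S}''$ succeeds in the game check of $\hat{V}^{\mathrm{ORAC}}$. Let $q'$ be the probability that $\mathscr{S}'$ succeeds in the game
check of $\hat{D}^{\mathrm{AR}}$. Then

$$\begin{aligned}
q'' &= \mathop{\mathbb{E}}_{(x_A,x_B)\sim\mu_S} \sum_{a_A,a_B:\, D(n,x_A,x_B,a_A,a_B)=1} \langle\psi| A^{x_A,x_B}_{a_A,a_B} \otimes I |\psi\rangle \\
&= \mathop{\mathbb{E}}_{(x_A,x_B)\sim\mu_S} \sum_{\Pi:\, D(n,x_A,x_B,\Delta(g_1),\Delta(g_2))=1} \langle\psi| \Lambda^{x_A,x_B}_{\Pi} \otimes I |\psi\rangle \\
&\ge \mathop{\mathbb{E}}_{(x_A,x_B)\sim\mu_S} \sum_{\Pi:\, D(n,x_A,x_B,\Delta(g_1),\Delta(g_2))=1} \langle\psi| \Lambda^{x_A,x_B}_{\Pi} \otimes I |\psi\rangle \cdot \Pr_{z}\big[M_{\mathrm{AR}}(z, \mathrm{eval}_z(\Pi)) = 1\big] \\
&= q' - \mathop{\mathbb{E}}_{(x_A,x_B)\sim\mu_S} \sum_{\Pi:\, D(n,x_A,x_B,\Delta(g_1),\Delta(g_2))=0} \langle\psi| \Lambda^{x_A,x_B}_{\Pi} \otimes I |\psi\rangle \cdot \Pr_{z}\big[M_{\mathrm{AR}}(z, \mathrm{eval}_z(\Pi)) = 1\big] \\
&\ge q' - (1 - q'') \cdot p_{\mathrm{sound}}.
\end{aligned}$$

Rearranging terms,

$$q'' \ge \frac{q' - p_{\mathrm{sound}}}{1 - p_{\mathrm{sound}}} = 1 - \frac{1 - q'}{1 - p_{\mathrm{sound}}}. \tag{153}$$

Altogether, using Lemma 10.33 we have shown that $\mathscr{S}''$ is accepted in each subtest performed by $\hat{D}^{\mathrm{ORAC}}$
with probability at least $1 - O(\delta_3^{1/2})$. Since every subtest occurs with constant probability, the lemma
follows. □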
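To get a feel for how much is lost in the rearrangement (153), the bound can be evaluated numerically. The function below is a small sketch; the name `oracle_game_check_bound` and the argument names are ours, with `q_prime` standing for the success probability of the low-degree strategy in the game check of the answer-reduced decider and `p_sound` for the PCP soundness parameter of Theorem 10.25.

```python
def oracle_game_check_bound(q_prime: float, p_sound: float) -> float:
    """Lower bound from (153) on the decoded strategy's game-check success probability."""
    if not 0 <= q_prime <= 1:
        raise ValueError("q_prime must be a probability")
    if not 0 <= p_sound < 1:
        raise ValueError("p_sound must lie in [0, 1)")
    # Equivalent to (q_prime - p_sound) / (1 - p_sound).
    return 1 - (1 - q_prime) / (1 - p_sound)
```

For instance, with `p_sound = 0.5` a failure probability of 0.01 in the answer-reduced game check at most doubles, giving a bound of 0.98. For constant `p_sound` bounded away from 1 the loss is therefore a constant factor in the failure probability.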


To conclude the proof of the soundness, we appeal to the soundness statement for $\hat{V}^{\mathrm{ORAC}}$ given in
Theorem 9.3. Now we establish the form of the error function $\delta$ as stated in the theorem.
Let $a', b'$ be the universal constants from Theorem 7.8 such that

$$\delta_{\mathrm{LD}}(\varepsilon, q, m, d, t) = a'(dmt)^{a'}\big(\varepsilon^{b'} + q^{-b'} + 2^{-b'md}\big).$$


By our choice of PCP parameters in Definition 10.22, we have that m < s, so therefore we can upper 
bound the product dm’ (m’ + 6) by cys? where c > 1 is a universal constant. Furthermore, we see from 
Definition 10.22 that 2%” > q > s(7"'+32)/"" so we have 


dip (E, q, m', d, m' Je 6) < a! (cy)" (sr r el! aD got’ . q”) 
< 2a' (ey) (887 : eh =" 30’ À ge an) 
< aly" (s™ ; eb” uli s77") (154) 


where we set a” = max{2a'c”,3a’} and b = b'. From Proposition 10.20 and Eq. (134) we have that s is 
at most poly((An)#,@), so there exists a universal constant C > 0 such that for all n > 2, 


s < C(A- o-n). 




Plugging this bound into Equation (154) and using the choice of q, d, m’ from Definition 10.22, we get that 
there exist universal constants a > 1 and 0 < b < 1 such that 


Idx 1/4 
ô? =O (ee qm’ dm +6)? + =~) ) 
<a ((A- o-n)" -è + (A-on) Y). 


To show the entanglement bound, we observe that the strategy $\mathscr{S}''$ for $\hat{V}^{\mathrm{ORAC}}_n$ constructed above uses the
same entangled state $|\psi\rangle$ as the strategy $\mathscr{S}$ for $\hat{V}^{\mathrm{AR}}_n$ that we started with; the claimed bound follows. □




11 Parallel Repetition 


In each of the transformations on verifiers presented so far (introspection, oracularization, and answer
reduction), the soundness gap of the resulting verifier is slightly degraded: while the completeness property
(i.e. the property of having a PCC strategy with success probability 1) is preserved, if the starting game $V_n$
has value at most $1 - \varepsilon$, the resulting game $V'_n$ has value at most $1 - \delta(\varepsilon, n)$ for some function $\delta$. In order to
apply the compression procedure recursively we need a way to restore the soundness gap after a sequence
of transformations. We accomplish this using (a modification of) the technique of parallel repetition. This
is a transformation of a two-player game $\mathfrak{G}$ into another two-player game $\mathfrak{G}^k$ where the verifier samples k
independent pairs $(x_1, y_1), \dots, (x_k, y_k)$ from the question distribution $\mu$ of $\mathfrak{G}$, sends the tuple $(x_1,\dots,x_k)$
to the first player to obtain an answer tuple $(a_1,\dots,a_k)$, and sends the tuple $(y_1,\dots,y_k)$ to the second player
to obtain an answer tuple $(b_1,\dots,b_k)$. The players win if and only if $D(x_i, y_i, a_i, b_i) = 1$ for all i.
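The winning condition of the repeated game is just the conjunction of k copies of the original predicate. A minimal sketch follows, in which the helper name `repeated_decider` and the toy predicate `chsh` are ours and serve only as an illustration:

```python
def repeated_decider(decider, k):
    """Return the decision predicate of the k-fold parallel repetition of a game."""
    def decide(xs, ys, answers_a, answers_b):
        if not all(len(t) == k for t in (xs, ys, answers_a, answers_b)):
            raise ValueError("all tuples must have length k")
        # The players win only if every one of the k instances is won.
        return all(
            decider(x, y, a, b)
            for x, y, a, b in zip(xs, ys, answers_a, answers_b)
        )
    return decide


def chsh(x, y, a, b):
    """Toy predicate: win iff a XOR b equals x AND y."""
    return (a ^ b) == (x & y)


decide3 = repeated_decider(chsh, 3)
```

Note that each player sees all k of their questions before answering, so the answer to instance i may depend on the other instances; this is why the value of the repeated game is not simply the k-th power of the original value.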

Intuitively, if the value of $\mathfrak{G}$ is v < 1, then one would expect the value of $\mathfrak{G}^k$ to decay
exponentially with the number of repetitions k. It is not true in general that the value of $\mathfrak{G}^k$ is $v^k$, but exponential
decay bounds on the (tensor product) value of parallel-repeated games are known for specific classes of
games [JPY14, DSV15, BVY17]. In particular, it was shown in [BVY17] that the class of anchored games
satisfies exponential-decay parallel repetition, and furthermore every game can be efficiently transformed
into an equivalent anchored game. Put together, this gives a general soundness amplification procedure
called “anchored parallel repetition,” which we use in our compression procedure to reset the soundness gap
to a fixed constant.

We describe the “anchoring” transformation of a game $\mathfrak{G}$ in more detail. The anchoring of $\mathfrak{G}$ is another
game $\mathfrak{G}_\perp$ where the verifier behaves as follows: it samples a question pair (x, y) from the question
distribution $\mu$ of $\mathfrak{G}$, and then for each player independently changes their question to a special symbol $\perp$ (not
part of the original game $\mathfrak{G}$) with probability 1/2 to obtain new questions (x', y'). The first player receives
question x' and responds with answer a; the second player receives question y' and responds with answer
b. If neither player receives the question $\perp$, then they win if and only if D(x', y', a, b) = 1 where D is
the decision predicate of $\mathfrak{G}$. Otherwise, the players win if and only if the players receiving the question $\perp$
output a specific, fixed answer (e.g. 0).
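The anchoring transformation can be tried out on a small game. The sketch below is an illustration under our own assumptions: it works with the classical value (maximum over deterministic strategies) rather than the entangled value studied in the text, games are given by a dictionary of question-pair probabilities and a Python predicate, and the names `ANCHOR`, `anchor_game` and `classical_value` are ours.

```python
import itertools
from fractions import Fraction

ANCHOR = "anchor"  # stands in for the special anchor symbol


def anchor_game(dist, decider):
    """Replace each player's question by ANCHOR independently with probability 1/2."""
    new_dist = {}
    for (x, y), p in dist.items():
        for x2 in (x, ANCHOR):
            for y2 in (y, ANCHOR):
                new_dist[(x2, y2)] = new_dist.get((x2, y2), 0) + p / 4

    def new_decider(x, y, a, b):
        if x == ANCHOR or y == ANCHOR:
            # Every player who received ANCHOR must answer 0; others are unconstrained.
            return (x != ANCHOR or a == 0) and (y != ANCHOR or b == 0)
        return decider(x, y, a, b)

    return new_dist, new_decider


def classical_value(dist, answers, decider):
    """Brute-force maximum winning probability over deterministic strategies."""
    xs = sorted({x for x, _ in dist}, key=repr)
    ys = sorted({y for _, y in dist}, key=repr)
    best = 0
    for fa in itertools.product(answers, repeat=len(xs)):
        sa = dict(zip(xs, fa))
        for fb in itertools.product(answers, repeat=len(ys)):
            sb = dict(zip(ys, fb))
            val = sum(p for (x, y), p in dist.items() if decider(x, y, sa[x], sb[y]))
            best = max(best, val)
    return best


# Toy game: uniform questions in {0,1}^2, win iff a XOR b == x AND y.
toy_dist = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}
toy_decider = lambda x, y, a, b: (a ^ b) == (x & y)
anchored_dist, anchored_decider = anchor_game(toy_dist, toy_decider)
base_value = classical_value(toy_dist, (0, 1), toy_decider)
anchored_value = classical_value(anchored_dist, (0, 1), anchored_decider)
```

With probability 3/4 at least one player is anchored and the players can win with certainty by answering 0 on the anchor symbol, so the anchored value is 3/4 plus 1/4 times the original value. For the toy game this gives 3/4 + (1/4)(3/4) = 15/16.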

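It is easy to see that $\mathrm{val}^*(\mathfrak{G}_\perp) = \frac{3}{4} + \frac{1}{4}\mathrm{val}^*(\mathfrak{G})$. In particular, we have that $\mathrm{val}^*(\mathfrak{G}_\perp) = 1$ if and only
if $\mathrm{val}^*(\mathfrak{G}) = 1$. Bavarian, Vidick and Yuen proved the following bound on the entanglement requirements
of $\mathfrak{G}^k_\perp$, which is the k-fold repetition of $\mathfrak{G}_\perp$.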


Theorem 11.1 (Parallel repetition of anchored games, Theorem 6.1 of [BVY17]). There exists a universal
constant c > 0 such that for all two-player games $\mathfrak{G}$ and for all positive integers k, for all $0 < \varepsilon \le 1$, for
all p satisfying

$$p > \frac{4}{\varepsilon} \cdot \exp\Big(-\frac{c\,\varepsilon^{17} k}{s}\Big),$$

where s denotes the bit-length of the players’ answers in the game $\mathfrak{G}$, we have

$$\mathscr{E}(\mathfrak{G}^k_\perp, p) \ge \mathscr{E}(\mathfrak{G}, 1 - \varepsilon).$$


We note that a corollary of Theorem 11.1 is that if $\mathrm{val}^*(\mathfrak{G}) < 1 - \varepsilon$, then $\mathrm{val}^*(\mathfrak{G}^k_\perp) \le p$ for
$p = \frac{4}{\varepsilon}\exp\big(-\frac{c\,\varepsilon^{17}k}{s}\big)$. This is because if $\mathrm{val}^*(\mathfrak{G}^k_\perp)$ were larger than p, there would be a finite upper bound on
$\mathscr{E}(\mathfrak{G}^k_\perp, p)$, but on the other hand $\mathscr{E}(\mathfrak{G}, 1 - \varepsilon)$ is by definition infinite, a contradiction.

In this section, we present the anchoring and parallel repetition transformations on normal form verifiers;
starting with a normal form verifier V = (S, D), we first apply the anchoring transformation to obtain a
normal form verifier $V^{\mathrm{ANCH}} = (S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$, then apply parallel repetition to obtain a normal form
verifier $V^{\mathrm{REP}} = (S^{\mathrm{REP}}, D^{\mathrm{REP}})$. The main theorem of the section, Theorem 11.4, establishes the properties
of the verifier $V^{\mathrm{REP}}$.


Remark 11.2. The parallel repetition theorems of [DSV15, Yue16] are also applicable for the purpose of
amplification of the soundness gap, but are not sufficient for us. This is because the (proofs of the) parallel
repetition theorems of [DSV15, Yue16] only imply that $\mathscr{E}(\mathfrak{G}^k, v) \ge \log \mathscr{E}(\mathfrak{G}, 1 - \varepsilon)$, which is not sufficient
for the recursive compression application; we need a larger entanglement lower bound on $\mathfrak{G}^k$. The weaker
entanglement bound is due to the use of the “quantum correlated sampling lemma” of [DSV15]; it is an
open question whether there is an improved analysis that obtains a better entanglement lower bound.


11.1 The anchoring transformation 


We present the anchoring transformation on normal form verifiers V = (S, D), which produces another
normal form verifier $V^{\mathrm{ANCH}} = (S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$. We first define a typed verifier $\hat{V}^{\mathrm{ANCH}} = (\hat{S}^{\mathrm{ANCH}}, \hat{D}^{\mathrm{ANCH}})$,
and then detype $\hat{V}^{\mathrm{ANCH}}$ using Lemma 6.18 to obtain $V^{\mathrm{ANCH}}$.

Define the type set $T^{\mathrm{ANCH}} = \{\mathrm{GAME}, \mathrm{ANCHOR}\}$ and type graph $G^{\mathrm{ANCH}}$ to be the complete graph over
$T^{\mathrm{ANCH}}$ along with self-loops at each vertex. The Turing machine $\hat{S}^{\mathrm{ANCH}}$ is defined as follows:

1. On input (n, DIMENSION), it returns S (n, DIMENSION). 


2. On input (n, w, MARGINAL, j, z, t), it returns S(n, w, MARGINAL, j, z) if t = GAME, and otherwise
returns the binary representation of the zero vector in $\mathbb{F}_2^s$ where s = S(n, DIMENSION).


3. On input (n, w, LINEAR, j, u, y, t), it returns S(n, w, LINEAR, j, u, y) if t = GAME, and otherwise
returns the binary representation of the zero vector in $\mathbb{F}_2^s$ where s = S(n, DIMENSION).


4. On input (n, w, FACTOR, j, u, t), it returns S(n, w, FACTOR, j, u) if t = GAME, and otherwise returns
the binary representation of the zero vector in $\mathbb{F}_2^s$ where s = S(n, DIMENSION).
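The four cases above share one pattern: forward the call to S when the type is GAME, and return a zero vector otherwise. The wrapper below sketches that pattern. It is an illustration under our own assumptions: the sampler is modelled as a Python callable `sampler(n, *args)` rather than a Turing machine, a list of zeros stands in for the binary representation of the zero vector, and the name `make_anchored_sampler` is ours.

```python
GAME, ANCHOR = "GAME", "ANCHOR"


def make_anchored_sampler(sampler):
    """Wrap a sampler so that queries of type ANCHOR return the zero vector."""
    def anchored(n, *args):
        if args == ("DIMENSION",):
            return sampler(n, "DIMENSION")
        # In every other case the last argument is the type t.
        *inner, t = args
        if t == GAME:
            return sampler(n, *inner)
        s = sampler(n, "DIMENSION")
        return [0] * s  # zero vector with s coordinates over F_2
    return anchored


def toy_sampler(n, *args):
    """Hypothetical stand-in for S: dimension 3, otherwise echo the query."""
    if args == ("DIMENSION",):
        return 3
    return ("called", args)


anchored_toy = make_anchored_sampler(toy_sampler)
```

Because the wrapper never inspects the inner arguments, the same code covers the MARGINAL, LINEAR and FACTOR queries.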


Define the Turing machine $\hat{D}^{\mathrm{ANCH}}$ as follows. On input $(n, t_A, x_A, t_B, x_B, a_A, a_B)$, if either $t_A$ or $t_B$ is equal to
the type ANCHOR, then the decider accepts as long as the players receiving the ANCHOR type answer
with 0 (a player who receives the GAME type can respond with any answer). Otherwise, it accepts only if
$D(n, x_A, x_B, a_A, a_B)$ accepts. Finally, define the Turing machines $(S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$ to be the result of the
detyping transformation Detype($\hat{V}^{\mathrm{ANCH}}$) from Lemma 6.18 on the pair $\hat{V}^{\mathrm{ANCH}} = (\hat{S}^{\mathrm{ANCH}}, \hat{D}^{\mathrm{ANCH}})$.

Note that this transformation is well-defined for any pair of Turing machines (S, D), even those that
don’t correspond to normal form verifiers! This transformation can be performed in time that is polynomial
in the length of the descriptions of S and D, and always outputs a pair of Turing machines $(S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$.
However, the following proposition establishes that if V = (S, D) is furthermore a normal form verifier,
then so is $V^{\mathrm{ANCH}} = (S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$ with certain completeness, soundness, and complexity properties.


Proposition 11.3. If $V = (S, D)$ is a normal form verifier, then $V^{\mathrm{ANCH}} = (S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$ is a normal form verifier satisfying the following for all $n \in \mathbb{N}$.


1. (Completeness) If there is a value-1 PCC strategy for $V_n$, then there is a value-1 PCC strategy for $V_n^{\mathrm{ANCH}}$.


2. (Soundness) For all $\varepsilon > 0$, we have $\mathscr{E}(V_n^{\mathrm{ANCH}}, 1 - \varepsilon) \ge \mathscr{E}(V_n, 1 - (4 \cdot 16^2)\varepsilon)$.




3. (Complexity) The time complexities of the verifier $V^{\mathrm{ANCH}}$ satisfy

$$\mathrm{TIME}_{S^{\mathrm{ANCH}}}(n) = \mathrm{poly}(\mathrm{TIME}_S(n)), \qquad \mathrm{TIME}_{D^{\mathrm{ANCH}}}(n) = \mathrm{poly}(\mathrm{TIME}_D(n)).$$

Furthermore the number of levels of $S^{\mathrm{ANCH}}$ is $\ell + 2$ and the dimension is $s(n) + 8$, where the number of levels and dimension of the sampler $S$ are $\ell$ and $s(n)$ respectively.


4. (Efficient computability) The descriptions of $S^{\mathrm{ANCH}}$ and $D^{\mathrm{ANCH}}$ can be computed in polynomial time from the descriptions of $S$ and $D$, respectively. In particular, the sampler $S^{\mathrm{ANCH}}$ only depends on the sampler $S$.


Proof. If $S$ is an $\ell$-level sampler with dimension $s(n)$, then by construction $\hat{S}^{\mathrm{ANCH}}$ is a $(T^{\mathrm{ANCH}}, G^{\mathrm{ANCH}})$-typed sampler, with field size $q(n) = 2$ and the same dimension $s(n)$. It has the following CL functions. Fix an integer $n \in \mathbb{N}$. Let $V = \mathbb{F}_2^{s(n)}$ denote the ambient space of $S$ on index $n$. Let $L^A, L^B : V \to V$ denote the CL functions of $S$ on index $n$. For $w \in \{A, B\}$ the associated CL functions $\{L^{\mathrm{ANCH},w}_t\}$ of $\hat{S}^{\mathrm{ANCH}}$ are

$$L^{\mathrm{ANCH},w}_t = \begin{cases} L^w & \text{if } t = \mathrm{GAME}, \\ 0 & \text{if } t = \mathrm{ANCHOR}. \end{cases}$$

Intuitively, when the type $t$ sampled for player $w$ is GAME, then they receive a question $L^w(z)$ as they would according to $S$. Otherwise, if $t = \mathrm{ANCHOR}$, then their question is the zero string. Thus if $L^w$ is an $\ell$-level CL function, then $L^{\mathrm{ANCH},w}_{\mathrm{GAME}}$ is also an $\ell$-level CL function, and $L^{\mathrm{ANCH},w}_{\mathrm{ANCHOR}}$ is a 0-level CL function.

We analyze the completeness, soundness, and complexity properties of the typed verifier $\hat{V}^{\mathrm{ANCH}}$; the corresponding properties of the detyped verifier $V^{\mathrm{ANCH}}$ follow from Lemma 6.18 and the fact that the type set $T^{\mathrm{ANCH}}$ has size 2.

Fix an index $n \in \mathbb{N}$. For the completeness property, let $\mathscr{S}$ be a value-1 PCC strategy for $V_n$. We define a value-1 PCC strategy $\mathscr{S}^{\mathrm{ANCH}}$ for $\hat{V}_n^{\mathrm{ANCH}}$ as follows: whenever a player receives the ANCHOR type as a question type, they perform the trivial measurement (i.e. measure the identity operator), and output 0. Otherwise, the player performs the same measurement as in $\mathscr{S}$. This is clearly value-1 and PCC. Item 1 follows from this and Lemma 6.18.

For the soundness property, we observe that if a strategy $\mathscr{S}^{\mathrm{ANCH}}$ has value $1 - \varepsilon$ in $\hat{V}_n^{\mathrm{ANCH}}$, then

$$1 - \varepsilon = \frac{3}{4} + \frac{p}{4},$$

where $p$ is the value of $\mathscr{S}^{\mathrm{ANCH}}$ in the game $V_n$, and $1/4$ is the probability that neither player receives the question type ANCHOR; this follows from the distribution associated with the typed sampler $\hat{S}^{\mathrm{ANCH}}$. Solving for $p$, this implies that $\mathscr{E}(\hat{V}_n^{\mathrm{ANCH}}, 1 - \varepsilon) \ge \mathscr{E}(V_n, 1 - 4\varepsilon)$. The soundness property (Item 2) then follows from Lemma 6.18.
Items 3 and 4 are straightforward and also follow from Lemma 6.18. □
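For the reader's convenience, the "solving for $p$" step in the soundness argument can be written out explicitly. The following takes the identity from the proof as given; nothing beyond elementary algebra is used.

```latex
\begin{align*}
1 - \varepsilon &= \frac{3}{4} + \frac{p}{4}
  && \text{(value of the strategy in the typed anchored game)} \\
4 - 4\varepsilon &= 3 + p
  && \text{(multiply both sides by 4)} \\
p &= 1 - 4\varepsilon .
\end{align*}
```

In words: a strategy losing $\varepsilon$ in the anchored game loses at most $4\varepsilon$ in the original game, since the original game is played with probability $1/4$.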


11.2 The parallel repetition transformation 


Starting with the anchored verifier $V^{\mathrm{ANCH}} = (S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$, we then perform the parallel repetition transformation.

Fix integers $\lambda, \tau \in \mathbb{N}$, and let $k(n) = (\lambda n)^{(1+c')\tau}$ where $c' > 0$ is the universal constant such that $\mathrm{TIME}_{D^{\mathrm{ANCH}}}(n) \le c'(\mathrm{TIME}_D(n))^{c'}$ for all $n \ge 1$. Define the Turing machine $S^{\mathrm{REP}}$ (depending on $\lambda, \tau$) as follows.




1. On input (n, DIMENSION), it computes $s' = S^{\mathrm{ANCH}}(n, \mathrm{DIMENSION})$ and $k = k(n)$, interprets $s'$ as a positive integer and outputs $k \cdot s'$.


2. On input (n, w, MARGINAL, j, z), it computes $s' = S^{\mathrm{ANCH}}(n, \mathrm{DIMENSION})$ and $k = k(n)$, parses $z = (z_1, \ldots, z_k) \in (\mathbb{F}_2^{s'})^k$, and then outputs

$$\big(S^{\mathrm{ANCH}}(n, w, \mathrm{MARGINAL}, j, z_1), \ldots, S^{\mathrm{ANCH}}(n, w, \mathrm{MARGINAL}, j, z_k)\big).$$


3. On input (n, w, LINEAR, j, u, y), it computes $s' = S^{\mathrm{ANCH}}(n, \mathrm{DIMENSION})$ and $k = k(n)$, parses $u = (u_1, \ldots, u_k)$, $y = (y_1, \ldots, y_k) \in (\mathbb{F}_2^{s'})^k$, and then outputs

$$\big(S^{\mathrm{ANCH}}(n, w, \mathrm{LINEAR}, j, u_1, y_1), \ldots, S^{\mathrm{ANCH}}(n, w, \mathrm{LINEAR}, j, u_k, y_k)\big).$$


4. On input (n, w, FACTOR, j, u), it computes $s' = S^{\mathrm{ANCH}}(n, \mathrm{DIMENSION})$ and $k = k(n)$, parses $u = (u_1, \ldots, u_k) \in (\mathbb{F}_2^{s'})^k$, and then outputs

$$\big(S^{\mathrm{ANCH}}(n, w, \mathrm{FACTOR}, j, u_1), \ldots, S^{\mathrm{ANCH}}(n, w, \mathrm{FACTOR}, j, u_k)\big).$$

Intuitively, the Turing machine $D^{\mathrm{REP}}$ behaves as follows. On input $(n, x, y, a, b)$, parse $x, y, a, b$ as $k(n)$-tuples of questions and answers, respectively, and accept if and only if $D^{\mathrm{ANCH}}(n, x_i, y_i, a_i, b_i)$ accepts for all $i \in \{1, \ldots, k(n)\}$. One potential issue is that, depending on how this parsing is implemented, the time complexity of the decider could be unbounded (for example, if $x$ is not properly formatted as a $k(n)$-tuple, then $D^{\mathrm{REP}}$ could take time that grows with the length of $x$, which could be much greater than $\mathrm{TIME}_{D^{\mathrm{ANCH}}}(n)$). To avoid this issue, the decider $D^{\mathrm{REP}}$ checks whether any component of the tuples $x, y, a, b$ has length larger than $(\lambda n)^{c'\tau}$ bits, and if so it rejects (this includes the case that $x, y, a, b$ are not properly formatted as $k(n)$-tuples). Thus, as long as the time complexity $\mathrm{TIME}_{D^{\mathrm{ANCH}}}(n)$ is at most $(\lambda n)^{c'\tau}$, the strings $x, y, a, b$ never need to be longer than $k(n) \cdot (\lambda n)^{c'\tau} = (\lambda n)^{(1+2c')\tau}$ bits in order to be accepted by the decider.
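The behaviour of the repeated decider described above can be sketched as follows. This is a minimal illustration under assumptions: the inputs are taken to be already split into Python sequences of bit strings (the actual parsing of a single string into a tuple is not modeled), and `d_anch`, `k` and `max_len` are hypothetical names for the anchored decider, the repetition count $k(n)$ and the per-component length bound.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for the anchored decider D^ANCH(n, x, y, a, b) -> bool.
Decider = Callable[[int, str, str, str, str], bool]


def repeated_decider(d_anch: Decider, n: int,
                     x: Sequence[str], y: Sequence[str],
                     a: Sequence[str], b: Sequence[str],
                     k: int, max_len: int) -> bool:
    """Accept iff all k coordinates are well-formed and accepted by d_anch."""
    tuples = (x, y, a, b)
    # Reject improperly formatted inputs: each must be a k-tuple.
    if any(len(t) != k for t in tuples):
        return False
    # Reject early if any component exceeds the length bound, so the
    # running time never depends on oversized components.
    if any(len(c) > max_len for t in tuples for c in t):
        return False
    # Logical AND of the k single-copy decisions.
    return all(d_anch(n, x[i], y[i], a[i], b[i]) for i in range(k))
```

The two early rejections are what keep the running time bounded: the single-copy decider is only ever invoked on components of bounded length.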

Note again that this transformation is well-defined for all Turing machines $(S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$. The next theorem shows that, if $(S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$ is the anchoring of a normal form verifier $V = (S, D)$, then $V^{\mathrm{REP}} = (S^{\mathrm{REP}}, D^{\mathrm{REP}})$ is also a normal form verifier with certain completeness, soundness, and complexity properties. This is the main result of the section.


Theorem 11.4 (Anchored parallel repetition of normal form verifiers). There exist universal constants $c, c' > 0$ and a polynomial-time Turing machine ComputeRepeatedVerifier that, given as input a tuple $(V, \lambda, \tau)$ where $V = (S, D)$ is a pair of Turing machines and $\lambda, \tau$ are positive integers, outputs a pair of Turing machines $V^{\mathrm{REP}} = (S^{\mathrm{REP}}, D^{\mathrm{REP}})$ such that the following holds. If $V = (S, D)$ is an $\ell$-level normal form verifier, then the output $V^{\mathrm{REP}}$ is a normal form verifier satisfying, for all $n \in \mathbb{N}$, letting $k(n) = (\lambda n)^{(1+c')\tau}$,


1. (Completeness) If $V_n$ has a value-1 PCC strategy and $\mathrm{TIME}_D(n) \le (\lambda n)^{\tau}$, then $V_n^{\mathrm{REP}}$ has a value-1 PCC strategy.


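2. (Soundness) For all $\varepsilon > 0$, for all

$$p > \frac{4}{\varepsilon} \exp\left(-\frac{c\, \varepsilon^{17}\, k(n)}{(\lambda n)^{c'\tau}}\right),$$

we have $\mathscr{E}(V_n^{\mathrm{REP}}, p) \ge \mathscr{E}(V_n, 1 - \varepsilon)$.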




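3. (Complexity) The repeated verifier $V^{\mathrm{REP}}$ is $(\ell + 2)$-level and has time complexities

$$\mathrm{TIME}_{S^{\mathrm{REP}}}(n) = O\big(k(n) \cdot \mathrm{TIME}_S(n)\big), \qquad \mathrm{TIME}_{D^{\mathrm{REP}}}(n) = O\big(k(n) \cdot \max(\mathrm{TIME}_D(n), (\lambda n)^{\tau})\big).$$


Furthermore, the repeated sampler $S^{\mathrm{REP}}$ only depends on $S$ and the parameters $\lambda, \tau$.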


Proof. The Turing machine ComputeRepeatedVerifier, given input $(V, \lambda, \tau)$, first computes the description of the Turing machines $(\hat{S}^{\mathrm{ANCH}}, \hat{D}^{\mathrm{ANCH}})$ and then their detyping $(S^{\mathrm{ANCH}}, D^{\mathrm{ANCH}})$ as described in Section 11.1. Then, it computes the pair of Turing machines $(S^{\mathrm{REP}}, D^{\mathrm{REP}})$ as described above. Clearly ComputeRepeatedVerifier runs in polynomial time.

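Suppose that $S$ is an $\ell$-level sampler with dimension $s(n)$. Then by Proposition 11.3, the number of levels of $S^{\mathrm{ANCH}}$ is $\ell' = \ell + 2$, and its dimension is $s'(n) = s(n) + 8$. Then by construction the Turing machine $S^{\mathrm{REP}}$ is a sampler with the following properties. It has field size $q(n) = 2$ and it has dimension $s^{\mathrm{REP}}(n) = k(n)\, s'(n)$. We treat the ambient space $V^{\mathrm{REP}}$ of $S^{\mathrm{REP}}$ on index $n$ as the $k(n)$-fold direct sum of the ambient space $V^{\mathrm{ANCH}}$ of $S^{\mathrm{ANCH}}$ on index $n$. For all integers $n \in \mathbb{N}$, $w \in \{A, B\}$, the CL functions $L^{\mathrm{REP},w} : V^{\mathrm{REP}} \to V^{\mathrm{REP}}$ of $S^{\mathrm{REP}}$ are as follows:

$$L^{\mathrm{REP},w} = \bigoplus_{i=1}^{k(n)} L^w,$$

where $L^A, L^B : V^{\mathrm{ANCH}} \to V^{\mathrm{ANCH}}$ are the CL functions of the sampler $S^{\mathrm{ANCH}}$ on index $n$. The CL functions $L^{\mathrm{REP},w}$ are $\ell'$-level; the $j$-th factor spaces $V^{\mathrm{REP},w}_{j,u}$ of $L^{\mathrm{REP},w}$ are defined as

$$V^{\mathrm{REP},w}_{j,u} = \bigoplus_{i=1}^{k(n)} V^{w}_{j,u_i}$$

for all $u = (u_1, \ldots, u_{k(n)})$, where $V^{w}_{j,u_i}$ is the $j$-th factor space of $L^w$ with prefix $u_i \in V^{\mathrm{ANCH}}$.

The number of levels of $S^{\mathrm{REP}}$ is the same as that of $S^{\mathrm{ANCH}}$, which is $\ell' = \ell + 2$.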

Thus, assuming that $V = (S, D)$ is a normal form verifier, we obtain that $V^{\mathrm{REP}} = (S^{\mathrm{REP}}, D^{\mathrm{REP}})$ is a normal form verifier, and thus defines an infinite family of games $(V_n^{\mathrm{REP}})_{n \in \mathbb{N}}$.

Item 1 follows from the following straightforward observation: if $\mathscr{S} = (\psi, A, B)$ is a value-1 PCC strategy for $V_n$ and $\mathrm{TIME}_D(n) \le (\lambda n)^{\tau}$, then it must be that the question and answer lengths in the strategy $\mathscr{S}$ are at most $(\lambda n)^{\tau}$. Now consider the strategy where the players share $k(n)$ copies of $|\psi\rangle$, and for the $i$-th instance of the game $V_n^{\mathrm{ANCH}}$, the players use strategy $\mathscr{S}$ on the $i$-th copy of $|\psi\rangle$ (performing the identity measurement whenever they receive the ANCHOR type). The total length of the questions and answers in the repeated game is at most $k(n) \cdot (\lambda n)^{\tau}$. It is straightforward to check that this strategy has value 1 and is PCC.

To show Item 2 we apply Theorem 11.1 and Proposition 11.3. The exponential decay bound on the value of $V_n^{\mathrm{REP}}$ presented in Theorem 11.1 depends on the answer length of the players in the game $V_n^{\mathrm{ANCH}}$. By Definition 5.33, this answer length is at most $\mathrm{TIME}_{D^{\mathrm{ANCH}}}(n) = c'(\mathrm{TIME}_D(n))^{c'} \le (\lambda n)^{c'\tau}$ for some universal constant $c' > 0$, so the claimed bound follows.

Item 3 follows from the fact that computing the direct sum of $k(n)$ CL functions and factor spaces of the “single-copy” sampler $S^{\mathrm{ANCH}}$ requires $k(n)$ times the complexity of the “single-copy” sampler, and the complexity of $S^{\mathrm{ANCH}}$ follows from Proposition 11.3. The time complexity of the decider follows from the repeated decider having to run $k(n)$ instances of the decider $D^{\mathrm{ANCH}}$ and take the logical AND of the $k(n)$ decider outputs, as well as the fact that the decider rejects (and thus halts) in the case that the question and answer tuples are improperly formatted. The complexity of $D^{\mathrm{ANCH}}$ follows from Proposition 11.3, which in turn runs an instance of the original decider $D$. Since the CL functions of the sampler $S^{\mathrm{ANCH}}$ are $(\ell + 2)$-level (by Proposition 11.3), and taking the direct sum of CL functions does not increase the number of levels (by Lemma 4.8), the CL functions $L^{\mathrm{REP},w}$ are $(\ell + 2)$-level.

The “Furthermore” part of the theorem can be easily verified by inspecting the construction of the samplers $S^{\mathrm{REP}}$ and $S^{\mathrm{ANCH}}$. □




12 Gap-Preserving Compression 


We combine the transformations from the previous sections (the introspection games, answer reduction, 
and parallel repetition) to obtain our main technical result, a gap-preserving compression theorem (Theo- 
rem 12.1) for normal form verifiers. 

Recall from Definition 5.32 that a verifier $V = (S, D)$ is $\lambda$-bounded if the description length $|V| = \max\{|S|, |D|\}$ is at most $\lambda$ and the time complexity bounds $\mathrm{TIME}_S(n)$ and $\mathrm{TIME}_D(n)$ are at most $n^{\lambda}$ for all $n \ge 2$. The compression theorem states the existence of an efficient “compression procedure” Compress that achieves the following. Given as input a $\lambda$-bounded verifier $V$ and the parameter $\lambda$, the procedure returns a “compressed verifier” $V^{\mathrm{COMPR}}$ such that the $n$-th game $V_n^{\mathrm{COMPR}}$ simulates (in a sense to be made precise shortly) the $N$-th game $V_N$ for $N = 2^n$, and furthermore both the sampler time complexity and the decider time complexity of $V_n^{\mathrm{COMPR}}$ can be exponentially smaller than those of $V_N$, which can be as large as $N^{\lambda}$. We note that the time complexity bounds of the compressed verifier hold without the assumption that the input verifier is $\lambda$-bounded. The completeness and soundness bounds of the theorem require the $\lambda$-boundedness of the input verifier $V$.


Theorem 12.1 (Compression theorem). There exists a universal constant $C_0 > 0$ and a polynomial-time Turing machine Compress that takes as input a tuple $(V, \lambda)$, where $V = (S, D)$ is a pair of Turing machines and $\lambda > 0$ is an integer, and outputs a 9-level normal form verifier $V^{\mathrm{COMPR}} = (S^{\mathrm{COMPR}}, D^{\mathrm{COMPR}})$. The time complexities of $S^{\mathrm{COMPR}}, D^{\mathrm{COMPR}}$ satisfy

$$\mathrm{TIME}_{S^{\mathrm{COMPR}}}(n) = \mathrm{poly}(n, \lambda), \qquad \mathrm{TIME}_{D^{\mathrm{COMPR}}}(n) = \mathrm{poly}(n, \lambda).$$

The description of $S^{\mathrm{COMPR}}$ is independent of $V$ and is computable from the binary representation of $\lambda$ in time $\mathrm{polylog}(\lambda)$. If furthermore $V$ is a $\lambda$-bounded, 9-level normal form verifier, then $V^{\mathrm{COMPR}}$ satisfies the following: for all $n \ge C_0$ and $N = 2^n$,


1. (Completeness) If $V_N$ has a value-1 PCC strategy, then $V_n^{\mathrm{COMPR}}$ has a value-1 PCC strategy.


2. (Soundness) & (VOOR, $) > max {E Vy, parmi 


12.1 Proof of Theorem 12.1 


Recall the following Turing machines. 


1. ComputeIntroVerifier takes as input a tuple $(V, \lambda, \ell)$ and returns in polynomial time a description of the introspective verifier $V^{\mathrm{INTRO}}$ corresponding to the verifier $V$ and parameters $(\lambda, \ell)$. (See Theorem 8.3.)


2. ComputeAnsRedVerifier takes input $(V, \lambda, \mu, \gamma)$ and returns in polynomial time a description of the answer reduced verifier $V^{\mathrm{AR}}$ corresponding to $V$ and parameters $(\lambda, \mu, \gamma)$. (See Theorem 10.27.)


3. ComputeRepeatedVerifier takes input $(V, \lambda, \tau)$ and returns in polynomial time a description of the anchored repeated verifier $V^{\mathrm{REP}}$ corresponding to $V$ and a number of repetitions $k(n) = (\lambda n)^{(1+c')\tau}$. (See Theorem 11.4.)




We specify the Turing machine Compress in Fig. 15. The universal constants $\mu$ and $\gamma$ are specified in Eq. (158), and the universal constant $\tau$ is specified in Eq. (163).


Input: $(V, \lambda)$ where $V = (S, D)$ is a pair of Turing machines and $\lambda$ is an integer.
1. Compute $V^{(1)} = \mathrm{ComputeIntroVerifier}(V, \lambda, 9)$.
2. Compute $V^{(2)} = \mathrm{ComputeAnsRedVerifier}(V^{(1)}, \lambda, \mu, \gamma)$.
3. Compute $V^{(3)} = \mathrm{ComputeRepeatedVerifier}(V^{(2)}, \lambda, \tau)$.
4. Return $V^{\mathrm{COMPR}} = V^{(3)}$.


Figure 15: The definition of the Turing machine Compress, where the universal constants $\mu$ and $\gamma$ are specified in Eq. (158), and the universal constant $\tau$ is specified in Eq. (163).


Lemma 12.2. Let Compress be the Turing machine specified in Fig. 15. For all pairs of Turing machines $V = (S, D)$ and integers $\lambda$, the output of $\mathrm{Compress}(V, \lambda)$ is a normal form verifier $V^{\mathrm{COMPR}} = (S^{\mathrm{COMPR}}, D^{\mathrm{COMPR}})$ such that $S^{\mathrm{COMPR}}$ does not depend on $V$ but only on the parameter $\lambda$. Furthermore, there is a Turing machine ComputeSampler that on input $\lambda$ outputs the description of the compressed sampler $S^{\mathrm{COMPR}}$ in time $\mathrm{polylog}(\lambda)$.


Proof. The verifier $V^{\mathrm{COMPR}} = V^{(3)}$ is a normal form verifier because the introspective verifier $V^{(1)}$ is a normal form verifier (even if the input $V$ is not; see Remark 8.7), and the verifier transformations ComputeAnsRedVerifier and ComputeRepeatedVerifier preserve the normal form (by Theorems 10.27 and 11.4).

The introspective verifier computed in Step 1 of Figure 15 has a sampler $S^{(1)}$ that only depends on the parameter $\lambda$, as implied by Lemma 8.1. The answer reduced verifier from Step 2 has a sampler $S^{(2)}$ that, according to Theorem 10.27, only depends on the sampler $S^{(1)}$, the parameter tuple $(\lambda, \mu, \gamma)$, and the description length $|D^{(1)}|$. Note that $S^{(1)}$ only depends on $\lambda$, the parameters $(\mu, \gamma)$ are universal constants, and the description length of $D^{(1)}$ is at most the time complexity of ComputeIntroVerifier on input $(V, \lambda, 9)$, which is polynomial in $\lambda$. Thus $S^{(2)}$ only depends on $\lambda$. Similarly, the sampler $S^{(3)}$ of the repeated verifier from Step 3 only depends on the sampler $S^{(2)}$, the parameter $\lambda$, and the universal constant $\tau$, as implied by Theorem 11.4. Therefore the final sampler $S^{\mathrm{COMPR}} = S^{(3)}$ only depends on $\lambda$.

Let $V^*$ be the pair of trivial Turing machines $(0, 0)$. Define ComputeSampler as the Turing machine that on input $\lambda$ computes $(S, D^{\mathrm{DUMMY}}) = \mathrm{Compress}(V^*, \lambda)$ and returns the description of $S$ as the output. By the above discussion, the description of $S$ is the same as that of $S^{\mathrm{COMPR}}$, and the time complexity of ComputeSampler is $\mathrm{polylog}(\lambda)$ as Compress runs in polynomial time: this follows from the fact that the Turing machines ComputeIntroVerifier, ComputeAnsRedVerifier, and ComputeRepeatedVerifier all run in polynomial time. □


Proof of Theorem 12.1. We evaluate the parameters of the verifiers generated in each step of the Compress procedure. These parameters are summarized in Table 2 and are explained in the following. Let $(V, \lambda)$ be the input to the Turing machine Compress. We write $N$ to denote $2^n$.


Introspection. The verifier $V^{(1)} = (S^{(1)}, D^{(1)})$ is the introspective verifier corresponding to $V$ and parameters $\lambda$ and $\ell = 9$ computed by ComputeIntroVerifier. We state its parameters and properties in the second row of Table 2 in terms of the index $n \in \mathbb{N}$.


             Time complexity
           Sampler      Decider       Levels   Completeness   Soundness
$V^{(1)}$  poly(n, λ)   poly(N^λ)     5        value-1 PCC    $\mathscr{E}(V_n^{(1)}, 1 - \varepsilon_1) \ge \max\{\ldots\}$
$V^{(2)}$  poly(n, λ)   poly(n, λ)    7        value-1 PCC    $\mathscr{E}(V_n^{(2)}, 1 - \varepsilon_2) \ge \mathscr{E}(V_n^{(1)}, 1 - \varepsilon_1)$
$V^{(3)}$  poly(n, λ)   poly(n, λ)    9        value-1 PCC    $\mathscr{E}(V_n^{(3)}, \tfrac{1}{2}) \ge \mathscr{E}(V_n^{(2)}, 1 - \varepsilon_2)$


Table 2: Parameters and properties of the input verifier $V$, introspective verifier $V^{(1)}$, answer reduced verifier $V^{(2)}$, and parallel repeated verifier $V^{(3)}$ computed by Compress. The quantities $\varepsilon_1, \varepsilon_2$ (which are functions of $n$) are specified below in the main text.
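The "Levels" column of Table 2 can be read as simple bookkeeping, sketched below. The function names are hypothetical, and two of the rules are assumptions made for illustration: that the introspective verifier is 5-level whenever the level argument is 9, and that answer reduction adds two levels (consistent with the 5 to 7 entry in the table, but not stated in general here). Only the "+2" of the repetition step is taken directly from Theorem 11.4.

```python
def intro_levels(input_levels: int) -> int:
    # Assumption: with the level argument set to 9, the introspective
    # verifier is 5-level regardless of the input (Table 2, first row).
    return 5


def ans_red_levels(levels: int) -> int:
    # Assumption: answer reduction adds two levels (5 -> 7 in Table 2).
    return levels + 2


def repetition_levels(levels: int) -> int:
    # Theorem 11.4: the repeated verifier is (l + 2)-level.
    return levels + 2


def compress_levels(input_levels: int) -> int:
    """Number of levels after introspection, answer reduction, repetition."""
    return repetition_levels(ans_red_levels(intro_levels(input_levels)))
```

Under these assumptions `compress_levels(9)` is again 9, which reflects why the compression theorem takes a 9-level verifier as input and returns a 9-level verifier: the output is again of the form the theorem accepts.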


The complexity bounds and the number of levels of the introspective verifier are given by Theorem 8.3. Using that the argument $\ell$ in ComputeIntroVerifier is set to 9, there exists a universal constant $C_{\mathrm{INTRO}} \ge 1$ such that for all $\lambda \in \mathbb{N}$ and for all $n \ge C_{\mathrm{INTRO}}$, we have


TIME sq) (n) < (A A n) Osteo | 
TIME pa) (n) < 206047 , (155) 
[DO] < Ciuro AE" , 


By Theorem 8.3 these bounds hold for all inputs $(V, \lambda, 9)$ (in particular these bounds do not assume that the input $V$ is $\lambda$-bounded or is even a normal form verifier).

The statements in the completeness and soundness entries of the row assume that the input verifier $V$ is 9-level and $\lambda$-bounded. The completeness entry follows from Theorem 8.3 because in the completeness case we assume the verifier $V$ has a value-1 PCC strategy, and thus the resulting introspective verifier has a value-1 PCC strategy. We deduce the soundness entry as follows. Let the function $\delta(\varepsilon, n)$ from Theorem 8.3 be denoted by $\delta_1(\varepsilon_1, n)$. Let $a_1 > 0$, $0 < b_1 < 1$ be the universal constants such that


5i(e1,n) =m (an) ey (An) (156) 
By setting 
m 1 Arei 1/b 
aclap) e Aem ast 


we get for all n > Cy, 


Ze —)+3 

— 8a1 4 
1 

<7’ 


where we used the fact that $\lambda \ge 1$ in the second line. Thus by the soundness guarantee of Theorem 8.3 we get that


e(v®,1 = €1) > max {E(Vn, ei 




for all $n \ge C_1$ as desired.


Answer reduction. Let $V^{(2)} = (S^{(2)}, D^{(2)})$ denote the answer reduced verifier corresponding to $V^{(1)}$ and parameters $(\lambda, \mu, \gamma)$ computed by ComputeAnsRedVerifier, where we define $\mu$ and $\gamma$ as


u = [Cino] , y= A (158) 
102 
where $a_1, b_1$ are the universal constants defining $\delta_1$ in Eq. (156), and $b_2$ is defined below in Eq. (161). The third row in Table 2 gives the properties of the answer reduced verifier.
The bounds on the number of levels and the time complexities follow from Theorem 10.27: the number of levels of the introspective verifier $V^{(1)}$ is 5, and therefore the number of levels of the sampler $S^{(2)}$ is 7. Furthermore, using the bounds on the complexities of $S^{(1)}$, $D^{(1)}$, and the bound on the description length $|D^{(1)}|$ from Eq. (155), we get that assumption (134) is satisfied and thus by Theorem 10.27 there exists a universal constant $C_{\mathrm{AR}} \ge 1$ such that for all $n \ge C_{\mathrm{AR}}$,

$$\max\{\mathrm{TIME}_{S^{(2)}}(n), \mathrm{TIME}_{D^{(2)}}(n)\} \le (\lambda n)^{C_{\mathrm{AR}}}. \tag{159}$$

The constant $C_{\mathrm{AR}}$ depends on the universal constants $\mu, \gamma$. Again, these time complexities do not depend on whether the starting verifier $V$ is $\lambda$-bounded.

For the soundness and completeness entries, we assume that the starting verifier $V$ is $\lambda$-bounded. If we further assume that $V_N$ has a value-1 PCC strategy, then $V_n^{(1)}$ has a value-1 PCC strategy (by the completeness guarantee of the introspective verifier), and then the completeness part of Theorem 10.27 guarantees that $V_n^{(2)}$ also has a value-1 PCC strategy.

The soundness entry is argued as follows. Define $\delta_2(\varepsilon_2, n)$ to be the function $\delta(\varepsilon, n)$ from Theorem 10.27; write it as


zlean) =a y" ((A- [DO | n) -e8 + (A- [DO] n) =T) (160) 


for universal constants $a' > 0$ and $0 < b' < 1$. By our bound on the description length of $D^{(1)}$ from Eq. (155), we have that $\delta_2$ can be bounded by

$$\delta_2(\varepsilon_2, n) \le a_2\big((\lambda n)^{a_2} \cdot \varepsilon_2^{b_2} + (\lambda n)^{-c_2}\big)$$

where we set
l2 = max{a'y”, a'y” (Curro), (1+ Cintro) Ha’ } , b =b, C2 = pb'y . (161) 


Since $a', b', \mu, \gamma$ and $C_{\mathrm{INTRO}}$ are universal constants, so are $a_2, b_2, c_2$. By setting


= Ey 1/by 7 b/a ie 
a= (aaa) pCa = (4m)™™ (8m1) (162) 




we get for all n > Co, 


(ezn) < a((An)® en (An) =) u1) 
= m- (An)® - £8 + az - (An) PeT! - ey 
= a- (An)? - ef + a - (An)~(27—-%1/P1) (841) 1/1 . e4 (Definition of £1) 
< a- (An)” ele + ag (An)~™/"1 (8a, )1/ - e1 (Definition of y) 
< m(=) +2 (n > Co, A > 1, def. of e2) 


< £1. 
Thus by the soundness guarantee of Theorem 10.27 we get that

$$\mathscr{E}(V_n^{(2)}, 1 - \varepsilon_2) \ge \mathscr{E}(V_n^{(1)}, 1 - \varepsilon_1)$$

for all $n \ge C_2$ as desired.


Parallel repetition. Let $V^{(3)} = (S^{(3)}, D^{(3)})$ denote the anchored repetition (see Section 11.2) of $V^{(2)}$ computed by ComputeRepeatedVerifier, with parameters $\lambda$ and $\tau$ defined below in Eq. (163). We state the parameters and properties of $V^{(3)}$ in the fourth row of Table 2.

The number of levels of $S^{(3)}$ is 9 by Item 3 of Theorem 11.4, because the number of levels of $V^{(2)}$ is 7. The time complexities of $S^{(3)}$ and $D^{(3)}$ follow from Item 3 of Theorem 11.4 and the complexity upper bounds on $S^{(2)}$ and $D^{(2)}$ specified in the third row of Table 2; in particular this uses the fact that $\mathrm{TIME}_{D^{(2)}}(n) \le (\lambda n)^{C_{\mathrm{AR}}} \le (\lambda n)^{\tau}$ by the way we set $\tau$ in Eq. (163). These bounds only depend on the bounds stated for verifier $V^{(2)}$ in the third row of Table 2, and again do not depend on whether the input verifier $V$ is $\lambda$-bounded.

For the soundness and completeness entries, we assume that the starting verifier $V$ is $\lambda$-bounded. The statement in the completeness entry follows from Item 1 of Theorem 11.4, where we assume that $V_N$ and thus $V_n^{(1)}$ and thus $V_n^{(2)}$ all have value-1 PCC strategies; therefore $V_n^{(3)}$ would have a value-1 PCC strategy as well.

The soundness entry is argued as follows. Let $c_3, c_4 > 0$ denote the universal constants $c, c'$ from Item 2 of Theorem 11.4. Recalling that $\varepsilon_2$ (defined in Eq. (162)) is a function of $n$ and $\lambda$, define $\tau$ to be the minimum integer satisfying $\tau \ge C_{\mathrm{AR}}$ and


(163) 


for all $n \ge \tau$ and for all integers $\lambda \ge 1$. The integer $\tau$ is well-defined because the right-hand side is upper-bounded by a fixed polynomial in $\lambda$. Setting $k(n) = (\lambda n)^{(1+c_4)\tau}$, we have that for all $n \ge \max\{\tau, C_{\mathrm{AR}}\}$


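$$\frac{4}{\varepsilon_2} \exp\left(-\frac{c_3\, \varepsilon_2^{17}\, k(n)}{(\lambda n)^{c_4 \tau}}\right) = \frac{4}{\varepsilon_2} \exp\left(-c_3\, \varepsilon_2^{17}\, (\lambda n)^{\tau}\right) \le \frac{1}{2}$$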

where we used our definition of $k(n)$ and the definition of $\tau$. Thus by the soundness guarantee of Theorem 11.4 we get that

$$\mathscr{E}(V_n^{(3)}, \tfrac{1}{2}) \ge \mathscr{E}(V_n^{(2)}, 1 - \varepsilon_2)$$

for all $n \ge \max\{\tau, C_{\mathrm{AR}}\}$ as desired.




Putting everything together. The verifier $V^{\mathrm{COMPR}}$ is $V^{(3)}$. We now put together the bounds and parameters from Table 2 to obtain the conclusions stated in Theorem 12.1.

The claim that the Turing machine Compress is polynomial time follows from the fact that the Turing machines ComputeIntroVerifier, ComputeAnsRedVerifier, and ComputeRepeatedVerifier all run in polynomial time.

The claimed number of levels and time complexity of the sampler $S^{\mathrm{COMPR}}$ follow from the complexity bounds in Table 2, which in turn only depend on the parameter $\lambda$. In particular the sampler $S^{\mathrm{COMPR}}$ is independent of the input verifier $V$, as shown in Lemma 12.2. Finally, the claimed time complexity to compute the description of $S^{\mathrm{COMPR}}$ also follows from Lemma 12.2.

The claimed time complexity of $D^{\mathrm{COMPR}}$ in the theorem statement also follows from Table 2.

We now establish the completeness and soundness properties of $V^{\mathrm{COMPR}}$. Assume that the input verifier $V$ is 9-level and $\lambda$-bounded. The completeness property (Item 1 in Theorem 12.1) follows from chaining together the completeness properties of $V_n^{(1)}, V_n^{(2)}, V_n^{(3)}$ and the assumption that $V_N$ has a value-1 PCC strategy.

We now establish the soundness property (Item 2 in Theorem 12.1). Set

$$C_0 = \max\{C_1, C_2, C_{\mathrm{AR}}, \tau\}. \tag{164}$$

Since $C_1, C_2, C_{\mathrm{AR}}, \tau$ are all universal constants, $C_0$ is also a universal constant. By chaining together the


(3) y2) yy) 


soundness properties of Vi, Vj", Vi’, we get that 
1 1 
e(O, 5) 2 eV, 1-22) > (V1 — £1) > max {80V 5), 20} 


for all $n \ge C_0$ as desired. This concludes the proof of the theorem. □


12.2 An MIP* protocol for the Halting problem 


For every Turing machine $M$, we construct a nonlocal game $\mathfrak{G}$ such that $\mathrm{val}^*(\mathfrak{G}) = 1$ if $M$ halts and $\mathrm{val}^*(\mathfrak{G}) \le \frac{1}{2}$ otherwise. In what follows, to explicitly disambiguate between a Turing machine (which is a tuple consisting of a set of states, transition rules, etc.) and its description (which is a binary string), we use calligraphic letters to denote the Turing machine $\mathcal{M}$ and an overline $\overline{\mathcal{M}}$ to denote a description of $\mathcal{M}$ (see Section 3.1 for a discussion of Turing machines and their descriptions).

First, we define a Turing machine $F$ as in Fig. 16. For all Turing machines $M$ and parameters $\lambda \in \mathbb{N}$ we define the decider $D^{\mathrm{HALT}}_{M,\lambda}$ to be the 5-input Turing machine

$$D^{\mathrm{HALT}}_{M,\lambda}(n, x, y, a, b) = F(\overline{F}, \overline{M}, \lambda, n, x, y, a, b),$$

i.e., on input $(n, x, y, a, b)$ it runs $F$ with input $(\overline{F}, \overline{M}, \lambda, n, x, y, a, b)$. We generally omit the subscripts $M, \lambda$ for notational simplicity and denote the decider as $D^{\mathrm{HALT}}$. Define $S^{\mathrm{HALT}} = \mathrm{ComputeSampler}(\lambda)$ and $V^{\mathrm{HALT}} = (S^{\mathrm{HALT}}, D^{\mathrm{HALT}})$. By inspecting the definition of the Turing machine $F$ in Fig. 16, one can see that the description $\overline{D}$ computed by $D^{\mathrm{HALT}}$ at Step 2 is the description of $D^{\mathrm{HALT}}$ itself.^35 Therefore, the input to the Compress Turing machine at Step 4 is the description of $V^{\mathrm{HALT}}$ and $\lambda$.
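The self-referential structure of $F$ (a machine that receives its own description as its first argument) can be illustrated with a toy sketch. Everything here is hypothetical scaffolding: Python callables stand in for descriptions, `make_machine` is a toy model of a Turing machine that only answers "do you halt within n steps?", and `toy_compress` is a stand-in that simply defers the $n$-th game to the $2^n$-th game of its input. The real Compress instead builds a small game that simulates the larger one; the sketch only shows how the fixed point is set up and why a halting machine makes the recursion bottom out.

```python
def make_machine(halting_time):
    """Toy 'Turing machine': reports whether it halts within n steps.
    halting_time=None models a machine that never halts."""
    def halts_within(n):
        return halting_time is not None and n >= halting_time
    return halts_within


def toy_compress(D):
    """HYPOTHETICAL stand-in for Compress (not the real construction):
    the n-th compressed decider defers to the (2**n)-th decider of D."""
    def D_compr(n, x, y, a, b):
        if n > 64:
            # Guard so the toy never tries to form astronomically large indices.
            raise OverflowError("toy index too large")
        return D(2 ** n, x, y, a, b)
    return D_compr


def F(R, M, lam, n, x, y, a, b):
    # Step 1: run M for n steps; accept if it halts.
    if M(n):
        return True
    # Step 2: build the decider D that runs R on (R, M, lam, ...).
    D = lambda n2, x2, y2, a2, b2: R(R, M, lam, n2, x2, y2, a2, b2)
    # Steps 3-4: "compress" the verifier (the sampler is omitted in this toy).
    D_compr = toy_compress(D)
    # Step 5: accept iff the compressed decider accepts.
    return D_compr(n, x, y, a, b)


def halt_decider(M, lam):
    """D^HALT: runs F on its own 'description' (here, the function F itself)."""
    return lambda n, x, y, a, b: F(F, M, lam, n, x, y, a, b)
```

For a machine that halts after 10 steps, a call at index 1 passes through indices 2, 4 and 16 before Step 1 accepts. For a machine that never halts, the toy eventually hits its guard, which mirrors the fact that no finite index ever triggers acceptance at Step 1.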

We note that for all Turing machines $M$ and integers $\lambda$, the Turing machine $D^{\mathrm{HALT}}$ halts on all inputs. This is because each step of $F$ terminates in some finite time. This uses that the procedures ComputeSampler


35 Alternatively, we can use Kleene’s recursion theorem [Kle54] here so that $D^{\mathrm{HALT}}$ is the fixed point of a Turing machine similarly defined as $F$. This approach was taken in an earlier version of the paper. We choose to present the proof without resorting to Kleene’s recursion theorem for the sake of simplicity.




Input: $(\overline{R}, \overline{M}, \lambda, n, x, y, a, b)$ where $\overline{R}$ is the description of an 8-input Turing machine $R$ and $\overline{M}$ is a description of a Turing machine $M$.


1. Run $M$ on the blank input for $n$ steps. Accept if it halts and continue otherwise.
2. Compute the description $\overline{D}$ of the 5-input Turing machine $D$ defined as follows:
$$D(n', x', y', a', b') = R(\overline{R}, \overline{M}, \lambda, n', x', y', a', b'),$$
i.e., on input $(n', x', y', a', b')$ it runs $R$ with input $(\overline{R}, \overline{M}, \lambda, n', x', y', a', b')$.
3. Compute the description $\overline{S} = \mathrm{ComputeSampler}(\lambda)$.
4. Compute the description $\overline{V}^{\mathrm{COMPR}} = \mathrm{Compress}(\overline{V}, \lambda)$ where $V = (S, D)$.
5. Accept if the decider $D^{\mathrm{COMPR}}$ of the verifier $V^{\mathrm{COMPR}}$ accepts $(n, x, y, a, b)$.


Figure 16: Description of the Turing machine $F$. The Turing machines ComputeSampler and Compress are given in Lemma 12.2 and Theorem 12.1.


and Compress both terminate in finite time, and the decider $D^{\mathrm{COMPR}}$ terminates in finite time (as given by Theorem 12.1), regardless of what input is given to Compress. Furthermore, the output of ComputeSampler is always a sampler, which means that $V^{\mathrm{HALT}} = (S^{\mathrm{HALT}}, D^{\mathrm{HALT}})$ is by construction a normal form verifier. We have the following lemma for $V^{\mathrm{HALT}}$.


Lemma 12.3. Let $M$ be a Turing machine and $\lambda$ an integer. Then the verifier $V^{\mathrm{HALT}}$ corresponding to $M$ and $\lambda$ has the following properties. For all $n \in \mathbb{N}$,


1. If $M$ halts in $n$ steps then $\mathrm{val}^*(V_n^{\mathrm{HALT}}) = 1$. In this case, there is a value-1 PCC strategy for the game $V_n^{\mathrm{HALT}}$.


2. If $M$ does not halt in $n$ steps then $V_n^{\mathrm{HALT}}$ has a value-1 PCC strategy if and only if $V_n^{\mathrm{COMPR}}$ does. Furthermore, under the same assumption (that $M$ does not halt in $n$ steps) it holds that


eve; 5) pan ey D : 
Proof. Suppose $M$ halts in $n$ steps. Then from the definition of $F$ and $D^{\mathrm{HALT}}$, it follows that $D^{\mathrm{HALT}}$ accepts on input $(n, x, y, a, b)$ for all $x, y, a, b$, and hence $\mathrm{val}^*(V_n^{\mathrm{HALT}}) = 1$. A value-1 PCC strategy for $V_n^{\mathrm{HALT}}$ is the following trivial strategy: for all questions, the players always return a fixed answer (e.g. 0).

If $M$ does not halt in $n$ steps, then $D^{\mathrm{HALT}}$ accepts on input $(n, x, y, a, b)$ if and only if $D^{\mathrm{COMPR}}$ accepts on $(n, x, y, a, b)$. Since $V_n^{\mathrm{HALT}}$ and $V_n^{\mathrm{COMPR}}$ use the same sampler $S^{\mathrm{HALT}} = \mathrm{ComputeSampler}(\lambda)$ by construction, any strategy for $V_n^{\mathrm{HALT}}$ is a strategy (with the same winning probability) for $V_n^{\mathrm{COMPR}}$ and vice versa. This implies that there is a value-1 PCC strategy for $V_n^{\mathrm{HALT}}$ if and only if there is a value-1 PCC strategy for $V_n^{\mathrm{COMPR}}$. The “Furthermore” statement also follows directly from this. □


We show in Lemma 12.5 below that for every Turing machine $M$ there is a choice of $\lambda$ such that the corresponding $V^{\mathrm{HALT}} = (S^{\mathrm{HALT}}, D^{\mathrm{HALT}})$ is $\lambda$-bounded. We first show a technical lemma.




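Lemma 12.4. Suppose $C, C' \in \mathbb{N}$ are integers. Then for all $\lambda \ge 4\max\{(4C)^{8C}, C \log C'\}$ and $n \ge 2$,

$$C (C' \cdot \lambda n)^{C} \le n^{\lambda}.$$


Proof. We aim to find a function $f(C, C')$ such that for all $\lambda \ge f(C, C')$, it holds that $C (C' \cdot \lambda n)^{C} \le n^{\lambda}$ for all $n \ge 2$. For this it suffices that

$$\lambda \ge 4\max\{\log C,\ C,\ C \log C',\ C \log \lambda\} \quad \text{for all } \lambda \ge f(C, C'). \tag{165}$$

This is because (165) implies that $\lambda \ge \log C + C + C \log C' + C \log \lambda$ which, by exponentiating both sides, gives

$$2^{\lambda} \ge 2^{\log C + C + C \log C' + C \log \lambda} = C \cdot 2^{C} \cdot (C' \lambda)^{C}.$$

Since $n^{\lambda - C} \ge 2^{\lambda - C}$ for all $n \ge 2$, we get that $n^{\lambda - C} \ge C (C' \lambda)^{C}$, which after multiplying both sides by $n^{C}$ is the desired inequality. It is easy to verify that for all $\lambda \ge (4C)^{8C}$, we have $\lambda \ge 4C \log \lambda$. Thus, setting

$$f(C, C') = 4\max\{(4C)^{8C}, C \log C'\},$$

we obtain (165), which concludes the lemma. □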
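As a sanity check, the inequality of Lemma 12.4 can be tested numerically for small constants. The sketch below is only an illustration: it assumes logarithms are base 2 (as the exponentiation step in the proof suggests), works in the log domain to avoid forming huge numbers, and rounds the threshold up to an integer. The function names are hypothetical.

```python
import math


def lemma_threshold(C: int, C_prime: int) -> int:
    """Integer upper rounding of 4 * max{(4C)^(8C), C * log2(C')}."""
    return 4 * max((4 * C) ** (8 * C), math.ceil(C * math.log2(C_prime)))


def inequality_holds(C: int, C_prime: int, lam: int, n: int) -> bool:
    """Check C * (C' * lam * n)^C <= n^lam by comparing base-2 logarithms."""
    lhs_log = math.log2(C) + C * math.log2(C_prime * lam * n)
    rhs_log = lam * math.log2(n)
    return lhs_log <= rhs_log
```

For example, with `C = 1` and `C_prime = 2` the threshold is `4 * 4**8`, and the inequality holds at that value for `n = 2`; for a small value such as `lam = 5` with `C = 2`, `C_prime = 3`, `n = 2` it fails, which shows that some lower bound on $\lambda$ is needed.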


Lemma 12.5. There is an integer parameter $\lambda$, polynomial-time computable from the description of the Turing machine $M$, and scaling as $\mathrm{poly}(|M|)$, such that the verifier $V^{\mathrm{HALT}}$ corresponding to $M$ and parameter $\lambda$ is $\lambda$-bounded and moreover $\mathrm{TIME}_{D^{\mathrm{HALT}}}(n), \mathrm{TIME}_{S^{\mathrm{HALT}}}(n) = \mathrm{poly}(n, |M|)$.


Proof. We first do an accounting of the time complexity of $D^{\mathrm{HALT}}$. By definition, for all $M$ and $\lambda$, the decider $D^{\mathrm{HALT}}$ runs $F$ on input $(\overline{F}, \overline{M}, \lambda, n, x, y, a, b)$. Its running time is therefore a polynomial in the time bounds of the following steps.


) time. 


) time. 


3. Computing the description $\overline{S}$: this takes $\mathrm{polylog}(\lambda)$ time (by Lemma 12.2).


) time (by Theorem 12.1). 


5. Executing $D^{\mathrm{COMPR}}$ on input $(n, x, y, a, b)$: this takes $\mathrm{poly}(n, \lambda)$ time (by Theorem 12.1).


, log A), we have that the description length of D is bounded 


by |D| < : Similarly, the description length of the sampler is bounded by |S |< 
polylog(A) as pian runs in eae time in the bit-length of A. Therefore, the size of the 
vty a ). So the time complexity bound in Item 4 is at most 


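). We emphasize that we did not use the $\lambda$-boundedness of the verifier $V$, as this is what we want to prove here! Overall the total time complexity of $D^{\mathrm{HALT}}$ can be bounded by

$$\mathrm{TIME}_{D^{\mathrm{HALT}}}(n) \le C_D \big(|M| \cdot |F| \cdot \lambda \cdot n\big)^{C_D}, \tag{166}$$

for some universal constant $C_D > 0$. By Lemma 12.4, for all $\lambda \ge \lambda_0 := 4\max\{(4C_D)^{8C_D}, C_D \log(|M| \cdot |F|)\}$ and $n \ge 2$, we have $\mathrm{TIME}_{D^{\mathrm{HALT}}}(n) \le n^{\lambda}$.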




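Similarly, by Theorem 12.1 the sampler $S^{\mathrm{HALT}}$ has running time $\mathrm{poly}(n, \lambda)$ and therefore there is a universal constant $C_S > 0$ such that

$$\mathrm{TIME}_{S^{\mathrm{HALT}}}(n) \le C_S (\lambda \cdot n)^{C_S}, \tag{167}$$

which, again by Lemma 12.4, is at most $n^{\lambda}$ for $\lambda \ge \lambda_1 := 4 \cdot (4C_S)^{8C_S}$ and $n \ge 2$.
As we have shown $|\overline{D}| = \mathrm{poly}(|M|, |F|, \log \lambda)$ and the description of the verifier $V$ is the same as that of $V^{\mathrm{HALT}}$, there exists an integer $C_V > 0$ such that

$$|V^{\mathrm{HALT}}| \le C_V \big(|M| \cdot |F| \cdot \log \lambda\big)^{C_V}.$$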


Define Az := (|M| . Fe - 2?Cv. One can verify that for all A > Àz, we have that ee <A. 

Finally, taking $\lambda = \lceil \max\{\lambda_0, \lambda_1, \lambda_2\} \rceil$ completes the proof. The quantity $|F|$ is a universal constant, because the Turing machine $F$ does not depend on $\lambda$ or $M$. The quantities $C_D, C_S, C_V$ are universal constants; therefore $\lambda_0, \lambda_1, \lambda_2$ and thus $\lambda$ can be computed from $|M|$. It is clear that $\lambda = \mathrm{poly}(|M|)$. Finally, for this value of $\lambda$ one immediately verifies from (166) and (167) that $\mathrm{TIME}_{D^{\mathrm{HALT}}}(n), \mathrm{TIME}_{S^{\mathrm{HALT}}}(n) = \mathrm{poly}(n, |M|)$, as claimed. □

Putting things together we obtain the following result.


Theorem 12.6. There exists a polynomial-time computable map from Turing machines $M$ to games $\mathfrak{G}$ that satisfies the “efficiency” requirement from Definition 5.29 and is such that


1. If $M$ halts on the empty input, then $\mathrm{val}^*(\mathfrak{G}) = 1$. Moreover, in this case there is a (finite-dimensional) PCC strategy for the players in $\mathfrak{G}$ that wins with certainty.


2. If $M$ does not halt on the empty input, then $\mathrm{val}^*(\mathfrak{G}) \le \frac{1}{2}$.


Proof. The map from Turing machines to games is computed as follows. Given a Turing machine $M$, first
compute the parameter $\lambda$ as given by Lemma 12.5. This takes time at most $\mathrm{poly}(|M|)$, and furthermore
$\lambda = \mathrm{poly}(|M|)$. Then, compute the description of the verifier $V^{\mathrm{HALT}} = (S^{\mathrm{HALT}}, D^{\mathrm{HALT}})$ corresponding
to $M$ and $\lambda$ by first computing the description of $S^{\mathrm{HALT}} = \mathrm{ComputeSampler}(\lambda)$ and then computing the
description of the Turing machine $D^{\mathrm{HALT}}$ that computes


$$D^{\mathrm{HALT}}(n, x, y, a, b) = F(F, M, n, x, y, a, b).$$


Computing the description of $V^{\mathrm{HALT}}$ thus takes time $\mathrm{polylog}(\lambda) + \mathrm{poly}(|M|)$, which is $\mathrm{poly}(|M|)$. Finally,
output the description of the game $\mathfrak{G} = V^{\mathrm{HALT}}_{C_0}$, where $C_0$ is the universal constant given by Theorem 12.1.
Clearly this computation takes $\mathrm{poly}(|M|)$ time in total and, according to the “moreover” part of
Lemma 12.5, returns a sampler and decider for $\mathfrak{G}$ that run in time $\mathrm{poly}(|M|)$, as required.

Now fix a Turing machine $M$ and let $\mathfrak{G}_n$ denote the $n$-th game $V^{\mathrm{HALT}}_n$ of the verifier $V^{\mathrm{HALT}}$ computed
by the aforementioned map (so in particular the output game $\mathfrak{G}$ is $\mathfrak{G}_{C_0}$). Suppose that $M$ halts on the empty
input; let $T$ be the minimum number of time steps required for $M$ to halt on the empty input. Observe
that for all $n \ge T$, by Lemma 12.3 it holds that $\mathfrak{G}_n$ has a value-1 PCC strategy. We will use this to show
inductively that $\mathfrak{G}_n$ also has a value-1 PCC strategy, for all $C_0 \le n < T$.


Claim 12.7. Let $C_0 \le n < T$. Suppose that $\mathfrak{G}_m$ has a value-1 PCC strategy for all $m > n$. Then $\mathfrak{G}_n$ also
has a value-1 PCC strategy.




Proof. Since by assumption $M$ does not halt in $n$ steps, by Lemma 12.3 it holds that $\mathfrak{G}_n$ has a value-1 PCC
strategy if $\mathfrak{G}^{\mathrm{COMPR}}_n$ does. Since $V^{\mathrm{HALT}}$ is $\lambda$-bounded (by Lemma 12.5) and $n \ge C_0$, by Theorem 12.1 it
follows that $\mathfrak{G}^{\mathrm{COMPR}}_n$ has a value-1 PCC strategy if $\mathfrak{G}_{2^n}$ does. Since $2^n > n$, this is true by the hypothesis of
the claim. Thus, $\mathfrak{G}_n$ has a value-1 PCC strategy as claimed. □


By Claim 12.7 and downwards induction on $n$ (with the base case $n = T$), we have that $\mathfrak{G}_n$ has a value-1
PCC strategy for all $n \ge C_0$. In particular, we have $\mathrm{val}^*(\mathfrak{G}) = \mathrm{val}^*(\mathfrak{G}_{C_0}) = 1$. This shows the first item
in the theorem statement.

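Now suppose that $M$ does not halt on the empty input. We have that for all $n \in \mathbb{N}$:


$$\mathcal{E}\big(\mathfrak{G}_n, \tfrac{1}{2}\big) = \mathcal{E}\big(\mathfrak{G}^{\mathrm{COMPR}}_n, \tfrac{1}{2}\big) \ge \mathcal{E}\big(\mathfrak{G}_{2^n}, \tfrac{1}{2}\big),$$


where the equality follows from Lemma 12.3 and the inequality follows from Theorem 12.1 (again using
the $\lambda$-bounded property of $V^{\mathrm{HALT}}$). By induction, we get that for all $k \in \mathbb{N}$,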


&(G, = 


3) = 8 (Gc5) 2 8 (Soa) 5) 7 6 (BERNE 5) > 528) 


2 9(&)2/ = 2 

where $g^{(k)}(\cdot)$ is the $k$-fold composition of the function $g(n) = 2^n$ and the second inequality follows from
Theorem 12.1 again. Since $g(\cdot)$ is a strictly monotonically increasing function this implies that there is no
finite upper bound on $\mathcal{E}(\mathfrak{G}, \frac{1}{2})$ and therefore every finite-dimensional strategy for the game $\mathfrak{G}$ must have
success probability at most $\frac{1}{2}$. □
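
To get a feel for why iterating the compression step rules out any finite entanglement bound, the following minimal Python sketch iterates the index map $g(n) = 2^n$ used in the argument above. The starting index 2 is an illustrative stand-in only; the actual constant $C_0$ from Theorem 12.1 is not specified here, and the helper names are hypothetical.

```python
def g(n: int) -> int:
    """Index map from the proof: the compressed n-th game relates to game 2^n."""
    return 2 ** n


def iterate_g(start: int, k: int) -> int:
    """Apply g to `start` exactly k times (the k-fold composition g^(k))."""
    value = start
    for _ in range(k):
        value = g(value)
    return value


# Illustrative starting index (an assumption, not the paper's constant C_0).
tower = [iterate_g(2, k) for k in range(4)]
```

Already after three iterations the index has grown from 2 to 65536, and each further iteration exponentiates again, so the indices reached by the induction are unbounded.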


Recall the definition of the complexity class RE, which stands for the set of recursively enumerable
languages (also called Turing-recognizable languages). Precisely, a language $L \subseteq \{0,1\}^*$ is in RE if
and only if there exists a Turing machine $M$ such that if $x \in L$, then $M(x)$ halts and outputs 1, and if
$x \notin L$, then either $M(x)$ outputs 0 or it does not halt. The Halting Problem is the language that contains
descriptions of Turing machines that halt on the empty input. The following well-known lemma shows
that the Halting Problem is complete for RE, meaning that every language $L \in$ RE can be polynomial-time
reduced to the Halting Problem. We include the simple proof for convenience.


Lemma 12.8. The Halting Problem is complete for RE via Karp reductions.³⁶

Proof. To see that the Halting Problem is in RE, define $M$ to take as input an $x$ that represents a Turing
machine $\mathcal{N} = [x]$, and run the universal Turing machine to simulate $\mathcal{N}$ on the empty input; if $\mathcal{N}$ halts with
a 1 then so does $M$.

To show that the Halting Problem is complete for RE, let $L \in$ RE and let $M$ be a Turing machine such that if
$x \in L$, then $M(x)$ halts and outputs 1. For an input $x$, let $\mathcal{N}_x$ be the following Turing machine. $\mathcal{N}_x$ first
runs $M$ on input $x$. If $M$ accepts, then $\mathcal{N}_x$ accepts. On all other outcomes, $\mathcal{N}_x$ goes into an infinite loop.
Thus $\mathcal{N}_x$ halts if and only if $x \in L$. □
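
The reduction in the proof above can be sketched in Python under a simplifying assumption: a "Turing machine" is modelled as a generator function that yields once per computation step and returns `True` to accept. The names `reduce_to_halting`, `halts_within` and `even_length` are hypothetical helpers for this illustration; `halts_within` uses a step budget only so that the example terminates, and a `False` result means "did not halt within the budget", not "never halts".

```python
from itertools import count


def reduce_to_halting(machine, x):
    """Build the zero-input machine that halts iff `machine` accepts x."""
    def n_x():
        accepted = yield from machine(x)   # first run the machine on input x
        if accepted:
            return True                    # accept, i.e. halt
        for _ in count():                  # all other outcomes: loop forever
            yield
    return n_x


def halts_within(machine, budget):
    """Run a zero-input machine for at most `budget` steps."""
    run = machine()
    for _ in range(budget):
        try:
            next(run)
        except StopIteration:
            return True
    return False


def even_length(x):
    """Toy recognizer: one step per symbol, accepts iff len(x) is even."""
    for _ in x:
        yield
    return len(x) % 2 == 0
```

For instance, `halts_within(reduce_to_halting(even_length, "ab"), 100)` is `True`, while the same call with `"abc"` is `False` for every budget, because the constructed machine loops once the toy recognizer rejects.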


Corollary 12.9. MIP* = RE. 


Proof. Since the Halting Problem is complete for RE, and by Theorem 12.6 is contained in MIPŤ 4 72 & 
MIP* (see Definition 5.29), we have the inclusion RE $\subseteq$ MIP*. The reverse inclusion MIP* $\subseteq$ RE follows
from the following observation. Let $L \in$ MIP*. From the definition of MIP* (see e.g. [VW16, Section


³⁶ A Karp reduction between languages $L$ and $K$ is a polynomial-time Turing machine $R$ such that $x \in L$ if and only if $R(x) \in K$
for all instances $x \in \{0,1\}^*$.




6.1], from which we borrow the terminology used here) it follows that there exists a polynomial-time Turing
machine $R$ such that for all $x \in \{0,1\}^*$, $R(x)$ is the description of an $m$-turn verifier $V_x$ interacting with $k$
provers, where $m$ and $k$ are both polynomial functions of $|x|$ and such that


$$\mathrm{val}^*(V_x) \ge 2/3 \quad \text{if } x \in L, \qquad \mathrm{val}^*(V_x) \le 1/3 \quad \text{if } x \notin L.$$


Consider the following Turing machine $A$: on input $x$, it computes $V_x = R(x)$, and then exhaustively
searches over tensor-product strategies of increasing dimension and increasing accuracy to evaluate a lower
bound on $\mathrm{val}^*(V_x)$. If $\mathrm{val}^*(V_x) \ge 2/3$, then for arbitrarily small $\delta > 0$ there exists a finite-dimensional tensor-
product strategy for the players that achieves value $2/3 - \delta > 1/3$. When the Turing machine $A$ identifies
such a strategy it terminates, outputting 1. If there is no such strategy, then $A$ never halts. This implies that
$L \in$ RE. □
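
The search performed by the Turing machine $A$ has the shape of a one-sided (semi-decision) procedure. The Python sketch below abstracts away the quantum part entirely: it assumes some enumeration of candidate strategies is available and that each candidate comes with an exactly computed winning probability. The name `semi_decide` and the optional `max_candidates` cut-off (present only so that the example terminates) are assumptions of this illustration, not part of the construction above.

```python
from fractions import Fraction
from itertools import count


def semi_decide(candidate_values, threshold=Fraction(1, 3), max_candidates=None):
    """Return 1 once some enumerated strategy value exceeds `threshold`.

    Without `max_candidates` this loops forever on instances where no
    candidate exceeds the threshold, mirroring a machine that never halts.
    With a cut-off it returns None, meaning "no conclusion".
    """
    for index, value in enumerate(candidate_values):
        if max_candidates is not None and index >= max_candidates:
            return None
        if value > threshold:
            return 1
    return None


# Toy yes-instance: values approach 2/3 from below as the search proceeds.
yes_values = (Fraction(2, 3) - Fraction(1, k + 2) for k in count())
# Toy no-instance: every strategy has value at most 1/3.
no_values = (Fraction(1, 3) - Fraction(1, k + 4) for k in count())

yes_answer = semi_decide(yes_values)
no_answer = semi_decide(no_values, max_candidates=1000)
```

On the yes-instance the search stops as soon as a value strictly above 1/3 appears; on the no-instance only the artificial cut-off ends the loop, which is exactly the asymmetry that places $L$ in RE rather than making it decidable.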


12.3 An explicit separation 


As discussed in Section 1.3, Theorem 12.6 implies that $C_{qa}$, the set of approximately finite-dimensional
correlations, is a strict subset of $C_{qc}$, the set of commuting-operator correlations. This is because if $C_{qa} =
C_{qc}$, then there exists an algorithm to approximate the entangled value of a given nonlocal game $\mathfrak{G}$ to
arbitrary accuracy. On the other hand, Theorem 12.6 shows that deciding whether a game has entangled
value 1 or at most 1/2, promised that one is the case, is undecidable. Therefore the correlation sets must be
different.

In fact, Theorem 12.6 implies that there is an infinite family $\mathcal{M}$ of Turing machines that do not halt on
an empty input such that for all $M \in \mathcal{M}$, the corresponding game $\mathfrak{G}_M$ has $\mathrm{val}^*(\mathfrak{G}_M) < \mathrm{val}^{\mathrm{co}}(\mathfrak{G}_M)$,
where recall that $\mathrm{val}^{\mathrm{co}}(\mathfrak{G}_M)$ denotes the commuting-operator value of $\mathfrak{G}_M$, which is the supremum of
success probabilities over all commuting-operator strategies for $\mathfrak{G}_M$.³⁷ However, it is not immediately
clear, given a specific non-halting Turing machine $M$, whether the associated game $\mathfrak{G}_M$ separates the
commuting-operator model from the tensor product model of strategies. While Theorem 12.6 implies that
$\mathrm{val}^*(\mathfrak{G}_M) \le \frac{1}{2}$, it could also be the case that $\mathrm{val}^{\mathrm{co}}(\mathfrak{G}_M) = \mathrm{val}^*(\mathfrak{G}_M)$ in that particular instance. We
conjecture that $\mathrm{val}^{\mathrm{co}}(\mathfrak{G}_M) = 1$ for all non-halting Turing machines $M$, but it appears to be difficult to
identify an explicit value-1 commuting-operator strategy that demonstrates this.

In this section we identify an explicit game $\mathfrak{G}$ that separates the tensor product model from the commuting-operator
model; we show that $\mathrm{val}^*(\mathfrak{G}) \le \frac{1}{2}$ but $\mathrm{val}^{\mathrm{co}}(\mathfrak{G}) = 1$. Interestingly, the proof does not exhibit an
explicit value-1 commuting-operator strategy for $\mathfrak{G}$.

We construct the separating game in a similar manner to the games constructed in Section 12.2. Let
$\mathcal{A}$ denote the following 1-input Turing machine: it takes as input a description of a nonlocal game $\mathfrak{G}$ and
runs the semidefinite programming hierarchy of [NPA08, DLTW08] to compute a non-increasing sequence
$\alpha_1, \alpha_2, \ldots$ of upper bounds on $\mathrm{val}^{\mathrm{co}}(\mathfrak{G})$ such that $\lim_{n \to \infty} \alpha_n = \mathrm{val}^{\mathrm{co}}(\mathfrak{G})$. The Turing machine $\mathcal{A}$ halts if
it obtains a bound $\alpha_n < 1$. Thus this algorithm eventually halts whenever $\mathrm{val}^{\mathrm{co}}(\mathfrak{G}) < 1$, and otherwise it
runs forever.
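
The halting rule of this machine depends only on the sequence of upper bounds, so it can be illustrated without implementing the semidefinite programming hierarchy itself. In the Python sketch below the sequences are made-up toy data (a non-increasing sequence converging to 3/4, and the constant sequence 1), and `bound_below_one_within` is a hypothetical helper that mimics running the machine for a bounded number of levels.

```python
from fractions import Fraction
from itertools import count, repeat


def bound_below_one_within(upper_bounds, levels):
    """True if one of the first `levels` upper bounds is strictly below 1."""
    for _, alpha in zip(range(levels), upper_bounds):
        if alpha < 1:
            return True
    return False


def toy_bounds_to_three_quarters():
    """Toy non-increasing sequence 3/4 + 1/n, converging to 3/4."""
    return (Fraction(3, 4) + Fraction(1, n) for n in count(1))


halts_by_level_4 = bound_below_one_within(toy_bounds_to_three_quarters(), 4)
halts_by_level_5 = bound_below_one_within(toy_bounds_to_three_quarters(), 5)
never_below_one = bound_below_one_within(repeat(Fraction(1)), 10_000)
```

In the first toy sequence the bounds are 7/4, 5/4, 13/12, 1, 19/20, so the rule fires at level 5 and not before; when the limit is exactly 1 no finite level ever triggers it.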

Consider the Turing machine $M$ in Figure 17. We follow the same steps as in Section 12.2. Let $D^{\mathrm{SEP}}$
be the decider that on input $(n, x, y, a, b)$, runs $M$ on input $(\overline{M}, \lambda, n, x, y, a, b)$. We sometimes leave the


³⁷ To see why this holds, observe that if it were the case that $\mathrm{val}^*(\mathfrak{G}_M) = \mathrm{val}^{\mathrm{co}}(\mathfrak{G}_M)$ for all but finitely many non-halting
$M$ then we could construct an algorithm $A$ to decide the Halting problem as follows. On input $M$, $A$ first checks if $M$ is one
of the finitely many Turing machines for which $\mathrm{val}^*(\mathfrak{G}_M) < \mathrm{val}^{\mathrm{co}}(\mathfrak{G}_M)$; if so, then it outputs a hard-coded answer for whether
$M$ halts on the empty tape or not. Otherwise, $A$ computes the nonlocal game $\mathfrak{G}_M$ and executes the aforementioned algorithm for
approximating the entangled value of games assuming that $C_{qa} = C_{qc}$.




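parameter $\lambda$ implicit. Define $S^{\mathrm{SEP}} = \mathrm{ComputeSampler}(\lambda)$ and $V^{\mathrm{SEP}} = (S^{\mathrm{SEP}}, D^{\mathrm{SEP}})$. By the definition of the
Turing machine $M$ in Fig. 17, the description $\overline{V}$ computed by $D^{\mathrm{SEP}}$ is the description of $V^{\mathrm{SEP}}$ itself and the
input to Compress is the description of $V^{\mathrm{SEP}}$ and $\lambda$. Define the game $\mathfrak{G}^{\mathrm{SEP}} = V^{\mathrm{SEP}}_{C_0}$ where $C_0$ is the constant
from Theorem 12.1. Then $\mathfrak{G}^{\mathrm{SEP}}$ is the game $\mathfrak{G}$ computed in $D^{\mathrm{SEP}}$.

A similar argument to that of Lemma 12.5 shows that there exist choices of $\lambda$ (computable from the
description of $\mathcal{A}$) such that $V^{\mathrm{SEP}}$ is $\lambda$-bounded.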


Input: $(\overline{R}, \lambda, n, x, y, a, b)$ where $\overline{R}$ is the description of a 7-input Turing machine.

1. Compute the description $\overline{S} = \mathrm{ComputeSampler}(\lambda)$.
2. Compute the description $\overline{D}$ of the 5-input Turing machine $D$ that on input $(n', x', y', a', b')$
runs $R$ with input $(\overline{R}, \lambda, n', x', y', a', b')$. Let $\overline{V} = (\overline{S}, \overline{D})$.
3. Compute the description of the game $\mathfrak{G} = V_{C_0}$ where $C_0$ is the integer from Theorem 12.1.
4. Simulate $\mathcal{A}$ on input $\mathfrak{G}$ for $n$ steps and accept if $\mathcal{A}$ halts.
5. Compute the description $\overline{V}^{\mathrm{COMPR}} = \mathrm{Compress}(\overline{V}, \lambda)$.
6. Accept if the decider $D^{\mathrm{COMPR}}$ of verifier $V^{\mathrm{COMPR}}$ accepts $(n, x, y, a, b)$.


Figure 17: Description of Turing machine $M$. The Turing machines ComputeSampler and Compress are
given in Lemma 12.2 and Theorem 12.1.


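Theorem 12.10. For the game $\mathfrak{G}^{\mathrm{SEP}} = V^{\mathrm{SEP}}_{C_0}$ it holds that $\mathrm{val}^*(\mathfrak{G}^{\mathrm{SEP}}) \le \frac{1}{2}$ and $\mathrm{val}^{\mathrm{co}}(\mathfrak{G}^{\mathrm{SEP}}) = 1$.


Proof. Suppose that $\mathrm{val}^{\mathrm{co}}(\mathfrak{G}^{\mathrm{SEP}}) = 1$. The invocation of the Turing machine $\mathcal{A}$ on input $\mathfrak{G}^{\mathrm{SEP}}$ never halts,
and therefore the decider $D^{\mathrm{SEP}}$ never accepts in Step 4 of Figure 17. Applying Theorem 12.1, we get that
$\mathcal{E}(V^{\mathrm{SEP}}_n, \frac{1}{2}) \ge \mathcal{E}(V^{\mathrm{SEP}}_{2^n}, \frac{1}{2})$ and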


e( v5", 5) > jal 


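for all $n \ge C_0$. An inductive argument implies that there is no finite upper bound on $\mathcal{E}(V^{\mathrm{SEP}}_{C_0}, \frac{1}{2})$, and thus
$\mathrm{val}^*(\mathfrak{G}^{\mathrm{SEP}}) = \mathrm{val}^*(V^{\mathrm{SEP}}_{C_0}) \le \frac{1}{2}$, which implies the theorem.

On the other hand, suppose that $\mathrm{val}^{\mathrm{co}}(\mathfrak{G}^{\mathrm{SEP}}) < 1$. Then there exists some $m \ge C_0$ such that $\mathcal{A}$
halts on input $\mathfrak{G}^{\mathrm{SEP}}$ after $m$ steps, so $V^{\mathrm{SEP}}_n$ has a value-1 PCC strategy for all $n \ge m$ (i.e. the players do
not respond with any answers). Thus by the completeness statement of Theorem 12.1 and an induction
argument analogous to the one in the proof of Theorem 12.6, we have that $V^{\mathrm{SEP}}_n$ has a value-1 PCC strategy
for all $n \ge C_0$, which implies that $\mathrm{val}^*(\mathfrak{G}^{\mathrm{SEP}}) = 1$, a contradiction because $\mathrm{val}^*(\mathfrak{G}^{\mathrm{SEP}}) \le \mathrm{val}^{\mathrm{co}}(\mathfrak{G}^{\mathrm{SEP}})$.


This completes the proof of the theorem. □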




A Analysis of the Pauli basis test 


In this appendix we give a proof of Theorem 7.14. For convenience, we repeat the statement of the theorem 
here. 


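Theorem A.1. There exists a function

$$\delta_{\mathrm{QLD}}(\varepsilon, m, d, q) = a (md)^a \big(\varepsilon^b + q^{-b} + 2^{-bmd}\big)$$

for universal constants $a > 1$ and $0 < b < 1$ such that the following holds. For all admissible parameter
tuples $\mathrm{qldparams} = (q, m, d)$ and for all strategies $\mathscr{S} = (|\psi\rangle, A, B)$ for the game $\mathfrak{G}^{\mathrm{PAULI}}_{\mathrm{qldparams}}$ that succeed
with probability at least $1 - \varepsilon$, there exist local isometries $\phi_A : \mathcal{H}_A \to \mathcal{H}_{A'} \otimes \mathcal{H}_{A''}$, $\phi_B : \mathcal{H}_B \to \mathcal{H}_{B'} \otimes \mathcal{H}_{B''}$
(where $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ and $\mathcal{H}_{A''}, \mathcal{H}_{B''} \cong (\mathbb{C}^q)^{\otimes M}$ with $M = 2^m$) and a state $|\mathrm{AUX}\rangle \in \mathcal{H}_{A'} \otimes \mathcal{H}_{B'}$ such
that

1. $\big\| \phi_A \otimes \phi_B |\psi\rangle - |\mathrm{AUX}\rangle \otimes |\mathrm{EPR}_q\rangle^{\otimes M} \big\| \le \delta_{\mathrm{QLD}}(\varepsilon, m, d, q)$,

2. Letting $\tilde{A}^x_a = \phi_A A^x_a \phi_A^\dagger$ and $\tilde{B}^y_b = \phi_B B^y_b \phi_B^\dagger$, we have for $W \in \{X, Z\}$

$$\tilde{A}^{(\mathrm{PAULI},W)}_u \otimes I_{B'B''} \approx_{\delta_{\mathrm{QLD}}} (\tau^W_u)_{A''} \otimes I_{A'B'B''},$$
$$I_{A'A''} \otimes \tilde{B}^{(\mathrm{PAULI},W)}_u \approx_{\delta_{\mathrm{QLD}}} I_{A'A''B'} \otimes (\tau^W_u)_{B''},$$

where the $\approx_{\delta_{\mathrm{QLD}}}$ statement holds with respect to the state $|\mathrm{AUX}\rangle_{A'B'} \otimes |\mathrm{EPR}_q\rangle^{\otimes M}_{A''B''}$ and the answer
summation is over $u \in \mathbb{F}_q^M$.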


We also recall the meaning of the parameters: $q$ is the field size, $m$ the number of variables, $d$ the degree
of the low-degree code, and $M = 2^m$ is the number of $|\mathrm{EPR}_q\rangle$ states being tested for.

This appendix is structured as follows. First, we establish some preliminary facts and definitions in
Appendix A.1. In the remainder of the appendix, we show Theorem A.1 by showing how to construct a
representation of the Pauli group on the provers’ Hilbert space, starting from a strategy $\mathscr{S}$ that succeeds
in $\mathfrak{G}^{\mathrm{PAULI}}_{\mathrm{qldparams}}$ with high probability. This is done in several stages: in Appendix A.2, we use the strategy
measurements to define a set of observables that correspond to a subset of the Pauli $X$ and $Z$ observables,
and show that these approximately satisfy the associated Pauli group relations. In Appendix A.3, we adjoin
ancillary registers containing EPR states to the provers’ state $|\psi\rangle$, and define a new set of approximately-commuting
observables acting on the expanded state. In Appendix A.4, we combine the approximately-commuting
observables into a strategy for the classical low-degree test, and in Appendix A.5, we apply
the soundness theorem of the classical low-degree test to construct a single projective measurement that
simultaneously measures all of the approximately-commuting observables. Finally, in Appendix A.6, we
use this single simultaneous measurement to construct an exact representation of the Pauli group, show that
it is close to the original strategy observables on the expanded state, and deduce that the game $\mathfrak{G}^{\mathrm{PAULI}}_{\mathrm{qldparams}}$ is
a self-test for the strategy $\mathscr{S}^{\mathrm{PAULI}}$.


A.1 Preliminaries 


Definition A.2. Given integers $d, m$ and a prime power $q$ we denote by $\mathrm{ideg}_{d,m}(\mathbb{F}_q)$ the set of polynomials
in $m$ variables with individual degree at most $d$ in each variable. When the field is clear from context we
write $\mathrm{ideg}_{d,m}$. When the number of variables is clear from context we write $\mathrm{ideg}_d$. We denote by $\deg_d$ the
set of polynomials with total degree at most $d$.




Recall the low-degree code defined in Section 3.4, in which vectors h € EM get mapped to polynomials 


gn € ideg, „(Fq) such that for every u € {0,1}, gn(u) = hy where the coordinates of h are indexed 
by binary strings of length m. For x € FF, recall the definition of the vector indm(x) € IF from 
Section 3.4. Recall also the finite field trace tr(-) = try,2(-) : Fy — Fo, where q = 2 for odd k, defined 
in Section 3.3.1. 


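Call a tuple $\omega = (u_X, u_Z, r_X, r_Z) \in (\mathbb{F}_q^m)^2 \times (\mathbb{F}_q)^2$ anticommuting if the quantity

$$\gamma = \mathrm{tr}\big((r_X \cdot \mathrm{ind}_m(u_X)) \cdot (r_Z \cdot \mathrm{ind}_m(u_Z))\big)$$

introduced in Equation (54) is not 0, and commuting if $\gamma = 0$.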


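Fact A.3. The probability that a tuple $\omega = (u_X, u_Z, r_X, r_Z)$ sampled uniformly at random from $(\mathbb{F}_q^m)^2 \times
(\mathbb{F}_q)^2$ is anticommuting is at least

$$\Big(1 - \frac{1}{q^m}\Big) \cdot \Big(1 - \frac{1}{q}\Big) \cdot \Big(1 - \frac{md}{q}\Big) \cdot \frac{1}{2} \;\ge\; \Big(1 - \frac{3md}{q}\Big) \cdot \frac{1}{2}.$$

The probability that $\omega$ is commuting is also at least $\big(1 - \frac{3md}{q}\big) \cdot \frac{1}{2}$.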


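Proof. First, note that $\mathrm{ind}_m(u_X) = 0$ if and only if $u_X = 0$. If $u_X = 0$ then $\omega$ is commuting. Otherwise, if
$u_X \ne 0$, then observe that for all $r_X, r_Z, u_Z$,

$$\mathrm{tr}\big((r_X \cdot \mathrm{ind}_m(u_X)) \cdot (r_Z \cdot \mathrm{ind}_m(u_Z))\big) = \mathrm{tr}\big(r_X r_Z (\mathrm{ind}_m(u_X) \cdot \mathrm{ind}_m(u_Z))\big) = \mathrm{tr}\big((r_X r_Z) \cdot g_{\mathrm{ind}_m(u_X)}(u_Z)\big),$$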


where Yind,, (ux) 18 a nonzero polynomial in ideg; ,,, (Fy). 

If $r_X = 0$, then $\omega$ is commuting. Condition on $r_X \ne 0$. Since $g_{\mathrm{ind}_m(u_X)} \ne 0$ has total degree at most $md$,
by the Schwartz-Zippel lemma the probability that $g_{\mathrm{ind}_m(u_X)}(u_Z)$ is zero over a uniformly random choice of
$u_Z$ is at most $md/q$. Conditioned on $g_{\mathrm{ind}_m(u_X)}(u_Z) \ne 0$, the product $(r_X r_Z) \cdot g_{\mathrm{ind}_m(u_X)}(u_Z)$ is a uniformly
random element of $\mathbb{F}_q$ when $r_Z$ is chosen uniformly at random.

The probability that $u_X \ne 0$, $r_X \ne 0$, $g_{\mathrm{ind}_m(u_X)}(u_Z) \ne 0$ is at least $(1 - q^{-m}) \cdot (1 - q^{-1}) \cdot (1 - md/q)$
since $u_X, r_X, u_Z$ are chosen independently.

Since $q = 2^k$ is an admissible field size, we have that for all $a \in \mathbb{F}_q$, the trace of $a$ is equal to $\sum_i a_i$
where $(a_1, \ldots, a_k) = \kappa(a) \in \mathbb{F}_2^k$ are defined in Section 3.3.1. Thus the probability that a uniformly random
element of $\mathbb{F}_q$ has trace 0 is exactly 1/2. □
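
The last step of the proof, that exactly half of the field elements have trace 0, can be checked by brute force on a small example. The Python sketch below uses $\mathbb{F}_8 = \mathbb{F}_2[x]/(x^3 + x + 1)$ and the standard formula $\mathrm{tr}(a) = a + a^2 + a^4$ for the trace down to $\mathbb{F}_2$. The field size, the bit encoding of elements and the helper names `gf_mul` and `trace` are choices made for this illustration; the text itself describes the trace through the coordinate map $\kappa$ of Section 3.3.1.

```python
def gf_mul(a: int, b: int, k: int = 3, modulus: int = 0b1011) -> int:
    """Multiply a and b in GF(2^k), elements encoded as k-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current shifted copy of a
        b >>= 1
        a <<= 1
        if (a >> k) & 1:
            a ^= modulus         # reduce modulo the field polynomial
    return result


def trace(a: int, k: int = 3, modulus: int = 0b1011) -> int:
    """Field trace a + a^2 + ... + a^(2^(k-1)), which lands in {0, 1}."""
    total, power = 0, a
    for _ in range(k):
        total ^= power
        power = gf_mul(power, power, k, modulus)
    return total


q = 8
trace_counts = [0, 0]            # trace_counts[t] = number of elements with trace t
for element in range(q):
    trace_counts[trace(element)] += 1
```

The count comes out as four elements of each trace, in line with the fact that the trace is a surjective $\mathbb{F}_2$-linear map.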


In this appendix we often use the “$\approx$” notation to specify closeness of general operators (not necessarily
POVMs) that are indexed by questions but not answers: given sets of operators $\{A^x\}$ and $\{B^x\}$ indexed by
questions $x \in \mathcal{X}$, we write $A^x \approx_\delta B^x$ to denote

$$\mathop{\mathbb{E}}_{x \sim \mu} \langle \psi | (A^x - B^x)^\dagger (A^x - B^x) | \psi \rangle \le O(\delta),$$

where $\mu$ is a question distribution and $|\psi\rangle$ is a state that has been fixed by context. This will be useful for
expressing closeness of observables, for example. The following two lemmas relate the closeness between
sets of operators and weighted sums of those operators.
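
As a concrete reading of this definition, the quantity on the left-hand side is the $\mu$-average of the squared norm of $(A^x - B^x)|\psi\rangle$. The Python sketch below evaluates it for small matrices given as nested lists; the function names `apply_operator` and `closeness` and the single-qubit example are assumptions made for illustration, and the sketch computes the exact average rather than an $O(\cdot)$ bound.

```python
def apply_operator(matrix, vector):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def closeness(ops_a, ops_b, mu, psi):
    """Average over x ~ mu of <psi|(A^x - B^x)^dagger (A^x - B^x)|psi>."""
    total = 0.0
    for op_a, op_b, weight in zip(ops_a, ops_b, mu):
        difference = [
            p - q for p, q in zip(apply_operator(op_a, psi), apply_operator(op_b, psi))
        ]
        total += weight * sum(abs(c) ** 2 for c in difference)
    return total


PAULI_X = [[0, 1], [1, 0]]
PAULI_Z = [[1, 0], [0, -1]]
ket_zero = [1, 0]

same = closeness([PAULI_X], [PAULI_X], [1.0], ket_zero)
different = closeness([PAULI_X], [PAULI_Z], [1.0], ket_zero)
mixed = closeness([PAULI_X, PAULI_X], [PAULI_X, PAULI_Z], [0.5, 0.5], ket_zero)
```

Here `different` equals 2 because $(X - Z)|0\rangle = |1\rangle - |0\rangle$ has squared norm 2, and `mixed` averages that with a zero term.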


Lemma A.4. Let $\{A^x\}$ and $\{B^x\}$ be operators indexed by some finite set $\mathcal{X}$, and let $\mu$ be a probability
distribution over $\mathcal{X}$. Let $\{\alpha_x\}$ be a set of complex numbers such that $|\alpha_x| \le 1$. Then if $A^x \approx_\delta B^x$ on
average over $x$ drawn from $\mu$, then $A \approx_\delta B$, where $A = \mathbb{E}_x \alpha_x A^x$ and $B = \mathbb{E}_x \alpha_x B^x$.




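Proof. Expand

$$\big\| (A - B)|\psi\rangle \big\|^2 \le \Big( \mathop{\mathbb{E}}_x \big\| (A^x - B^x)|\psi\rangle \big\| \Big)^2 \le \mathop{\mathbb{E}}_x \big\| (A^x - B^x)|\psi\rangle \big\|^2 \le \delta.$$

The first inequality is the triangle inequality (together with $|\alpha_x| \le 1$) and the second inequality is Jensen’s inequality. □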


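Lemma A.5. Let $\mathcal{A}$ be a finite set. Let $\{A^x_a\}_{a \in \mathcal{A}}$ and $\{B^x_a\}_{a \in \mathcal{A}}$ be POVMs and let $\{\alpha_a\}_{a \in \mathcal{A}}$ be a collection
of complex numbers on the unit circle. Let $A^x = \sum_a \alpha_a A^x_a$ and $B^x = \sum_a \alpha_a B^x_a$. Then

$$A^x_a \approx_\delta B^x_a \quad \text{implies} \quad A^x \approx_{\delta'} B^x$$

for $\delta' = |\mathcal{A}| \cdot \delta$.


Proof. Expand

$$\mathop{\mathbb{E}}_x \big\| (A^x - B^x)|\psi\rangle \big\|^2 = \mathop{\mathbb{E}}_x \Big\| \sum_a \alpha_a (A^x_a - B^x_a)|\psi\rangle \Big\|^2 \le \mathop{\mathbb{E}}_x \Big( \sum_a \big\| (A^x_a - B^x_a)|\psi\rangle \big\| \Big)^2 \le \mathop{\mathbb{E}}_x |\mathcal{A}| \cdot \sum_a \big\| (A^x_a - B^x_a)|\psi\rangle \big\|^2 \le |\mathcal{A}| \cdot \delta.$$

The first inequality follows from the triangle inequality, the second inequality follows from Cauchy-Schwarz,
and the last inequality follows from the assumption that $A^x_a \approx_\delta B^x_a$. □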


The following lemma shows that POVMs that are approximately projective (in the sense of being self-consistent)
are close to exact projective measurements. The lemma first appears in [KV11]; for a self-contained
proof, see [JNV+20].


Lemma A.6 (Orthonormalization lemma). Let $0 < \delta < 1$. There exists a function $\eta_{\mathrm{ortho}}(\delta) = O(\delta^{1/4})$
such that the following holds. Let $|\psi\rangle$ be a state on $\mathcal{H}_A \otimes \mathcal{H}_B$ where $\mathcal{H}_A, \mathcal{H}_B$ are finite dimensional. Let
$\{Q_a\}$ and $\{R_a\}$ be POVMs on $\mathcal{H}_A$ and $\mathcal{H}_B$ respectively such that

$$Q_a \otimes I_B \simeq_\delta I_A \otimes R_a. \qquad (168)$$

Then there exists a projective measurement $\{P_a\}$ such that

$$P_a \otimes I_B \approx_{\eta_{\mathrm{ortho}}} Q_a \otimes I_B. \qquad (169)$$


A.2 Strategies 


Let $\mathrm{qldparams} = (q, m, d)$ be an admissible parameter tuple (Definition 7.12) and let $\mathscr{S} = (|\psi\rangle, M)$ be a
strategy for $\mathfrak{G}^{\mathrm{PAULI}}_{\mathrm{qldparams}}$ that succeeds with probability at least $1 - \varepsilon$, where $|\psi\rangle$ is a bipartite state on registers
$A$ and $B$, with associated Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ respectively. For notational convenience we use $M$ to
denote the operators for both players; it will be clear from context which player’s operators we are referring
to (for example, we write $M^{(\mathrm{POINT},W),u}_a \otimes I_B$ to indicate that $M^{(\mathrm{POINT},W),u}_a$ is viewed as an operator acting on
register $A$).

Using Naimark’s theorem (as formulated in [JNV+20, Theorem 5.1]), at the cost of extending the state
$|\psi\rangle$ by adding sufficiently many qubits initialized in the $|0\rangle$ state we may assume that the measurements in
$\mathscr{S}$ are projective. Let

$$|\psi'\rangle = |\psi\rangle \otimes |0 \cdots 0\rangle \otimes |0 \cdots 0\rangle$$

be the extended state. In the remainder of this section, we will work with the projective strategy on the
extended state, which we continue labeling $|\psi\rangle$ (instead of $|\psi'\rangle$) for notational convenience.

We introduce several additional notational shorthands. By definition the strategy $\mathscr{S}$ contains measurement
operators for each of the possible question types and content summarized in Fig. 8, such as
$\{M^{(\mathrm{POINT},X),u}_a\}_{a \in \mathbb{F}_q}$ for all $u \in \mathbb{F}_q^m$, etc. For “line” questions (of type ALINE or DLINE) we will use
LINE as a formal placeholder for either type ALINE or DLINE; which type is meant will be clear from the
nature of the line. Moreover, we will use $\ell$ to denote the description of either an axis-parallel line (given
by a pair $\ell = (u_0, s)$) or a diagonal line (given by a triple $\ell = (u_0, s, v)$). We also introduce the following
collection of observables constructed from the measurement operators in the strategy. For every $u \in \mathbb{F}_q^m$,
$r \in \mathbb{F}_q$, and $W \in \{X, Z\}$, define

$$W^r(u) = \sum_{a \in \mathbb{F}_q} (-1)^{\mathrm{tr}(ar)} M^{(\mathrm{POINT},W),u}_a. \qquad (170)$$

The observable $W^r(u)$ acts on register $A$ or $B$ depending on whether $M^{(\mathrm{POINT},W),u}_a$ is viewed as being an
operator acting on register $A$ or $B$; this will be made clear from context.

Recall the definitions of (anti-)commuting tuple given at the start of Section A.1.

Lemma A.7. Assume that Z succeeds in Crane with probability at least 1 — e, for some £ > 0. Then 
the following hold: 


PAULI 


qldparams’ we 


1. (Consistency check) On average over (t,x) sampled from the question distribution of & 
have that M‘* @ Ip ~: Ia @ Mt”. 


2. (Low-degree check) For all W € {X,Z}, on average over (£,u) drawn from the line-point dis- 
tribution (see Definition 7.6), we have Moneys Q Ig ~: In ® ona) u 


ia , where the answer 
maces [eval,, (-)=a] 
summation is over a € F}. 


3. (Pauli basis consistency check) For allW € {X, Z}, on average over u € Fj’, we have i © 
ao (PAULI,W) (PAULLW) __ (PAULI,W) . 
Ip ~e In Q Mig, (u)=a] where M'g, (u) =a] = LneEM:gy(u)=a M; and Qn is the low-degree en- 
coding of h. 
4. (Commutation check) For all W € {X,Z}, on average over commuting w = (ux, uz, rx, "z), we 
have MEWO ® Ip ~: In Q Mane. where the answer summation is over a € Fo. 


[Bw=a] 


5. (Commutation consistency check) For allW € {X, Z}, on average over commuting w = (ux, uz, rx, "z), 


(POINT,W) uw 
[tr(-rw) =a] 


@ Ip ~, Ia Q MEWO 


we have M , Where the answer summation is over a € Fp. 




6. (Magic square check) For any anticommuting w = (ux,Uz,1x,1z) let Aw be the value of the strategy 
(19), TM ANI S U 5 0 aa }) in the game MS. Then 


w:tr((rx-ux) (rzuz))#0 


7. (Magic square consistency check) For all W € {X,Z}, on average over anticommuting w = 
(ux, Uz, "x, "Z), 


Moa si 2 Ie ~ IQ Mes ae (171) 
MENZE © Ty m I4 0 MON, am) 


where the answer summations are over F>. 


Furthermore, all of the approximations above hold with “~,” instead of “œ”, and they also hold with the 
tensor factors interchanged. 


Proof. Since the question types are sampled according to the type graph $G^{\mathrm{PAULI}}$ of finite size in Fig. 7 in the
game $\mathfrak{G}^{\mathrm{PAULI}}_{\mathrm{qldparams}}$, the probability of any of the subtests of $D^{\mathrm{PAULI}}_{\mathrm{qldparams}}$ (described in Fig. 8) being executed is
at least some universal constant. Thus the consistency conditions expressed in Items 1, 2, 3 of the lemma
follow directly from the corresponding tests in Fig. 8 (where Item 2 follows from the consistency test that is
a subtest of $D^{\mathrm{LD}}_{\mathrm{ldparams}}$) and the fact that the strategy $\mathscr{S}$ is assumed to succeed with probability at least $1 - \varepsilon$
in $\mathfrak{G}^{\mathrm{PAULI}}_{\mathrm{qldparams}}$.

For the remaining items, recall that by Fact A.3 the probability that a tuple $\omega = (u_X, u_Z, r_X, r_Z)$ sampled
as question content for a question of type PAIR, (PAIR, $W$), CONSTRAINT$_i$ or VARIABLE$_j$ is commuting is
at least $(1 - 3md/q) \cdot (1/2)$, which is at least 1/4 assuming $6md \le q$. We claim this assumption holds
without loss of generality. Indeed, observe that the function $\delta_{\mathrm{QLD}}$ in Theorem A.1 is monotonically increasing
in $a$, so without loss of generality we may assume that $a \ge 6$. Then it follows that we may assume that
$6md \le q$ without loss of generality, since otherwise $\delta_{\mathrm{QLD}} \ge amd/q > 1$ and the conclusion of the theorem
holds trivially. The same lower bound holds for the probability that $\omega$ is anticommuting. Therefore, the
consistency conditions in Items 4, 5 and 7 of the lemma also follow directly from the corresponding tests in
Fig. 8. Finally, Item 6 in the lemma follows from the fact that the Magic Square check in Fig. 8 is executed
with constant probability over the choice of a pair of questions.

That all of the approximations in the lemma statement also hold with $\simeq_\varepsilon$ replaced by $\approx_\varepsilon$ is immediate
from Fact 5.18. □


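Lemma A.8. The following hold for the observables defined in (170).


• For all $W \in \{X, Z\}$, $r \in \mathbb{F}_q$, on average over uniformly random $u \in \mathbb{F}_q^m$ we have

$$W^r(u) \otimes I_B \approx_\varepsilon I_A \otimes W^r(u). \qquad (173)$$

• On average over uniformly random $\omega = (u_X, u_Z, r_X, r_Z)$,

$$X^{r_X}(u_X) Z^{r_Z}(u_Z) \otimes I_B \approx_{\sqrt{\varepsilon}} (-1)^{\mathrm{tr}((r_X \cdot \mathrm{ind}_m(u_X)) \cdot (r_Z \cdot \mathrm{ind}_m(u_Z)))} Z^{r_Z}(u_Z) X^{r_X}(u_X) \otimes I_B. \qquad (174)$$


Furthermore, all approximations above hold with the tensor factors interchanged.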


Proof. We first establish (173). Fix a W € {X,Z} andr € F}. From item 1 of Lemma A.7 and the data 
processing inequality (Fact 5.24), using that the question type (POINT, W) has constant probability of being 
sampled by SPAH, we get that on average over u € Fy, 


(POINT,W),u (PoINT,W),u 
Micra] © Te Se Ta 8 Mierea” 


where the answer summation is over a € F2. Equation (173) follows directly from Lemma A.5. 
We now establish (174). We have that on average over commuting w = (ux, Uz,rx,ťrz) each of the 
approximate consistency relations hold, for any W € {X, Z}: 


Meee. hea 8 Ip ~: In Q ME (Item 5 of Lemma A.7) 
~e Mia e] © Ip (Item 4) 
~; I, Q Mp i . (Item 1 and Fact 5.24) 


Fact 5.18 then implies that on average over commuting w, 


(POINT,W),uw PAIR,W 
Mite(ry)=a) © B ~e la B Mig, <a) - (aTa) 
Since {Mere} is projective, Equation (175) puts us in a position to apply Lemma 5.25, setting A* if 
Me x Cie = Me. a and Br, = =M, It follows that 
(POINT,X),u Ba Z),u (POINT,Z),u (POINT,X) ,u 
Mr(-rx)=b b] “Mý tr(-rz =c] ý Q Ip & We Mii(rz)=c c] "Mi tr(-r x)= b] 2 Q Ig (176) 


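on average over commuting $\omega$. Then for every $\omega = (u_X, u_Z, r_X, r_Z)$, since the measurements $\{M^{(\mathrm{POINT},W),u}_a\}$ are
projective we get that

$$X^{r_X}(u_X) Z^{r_Z}(u_Z) = \big(1 - 2M^{(\mathrm{POINT},X),u_X}_{[\mathrm{tr}(\cdot\, r_X)=1]}\big)\big(1 - 2M^{(\mathrm{POINT},Z),u_Z}_{[\mathrm{tr}(\cdot\, r_Z)=1]}\big)$$
$$= 1 - 2M^{(\mathrm{POINT},X),u_X}_{[\mathrm{tr}(\cdot\, r_X)=1]} - 2M^{(\mathrm{POINT},Z),u_Z}_{[\mathrm{tr}(\cdot\, r_Z)=1]} + 4 M^{(\mathrm{POINT},X),u_X}_{[\mathrm{tr}(\cdot\, r_X)=1]} M^{(\mathrm{POINT},Z),u_Z}_{[\mathrm{tr}(\cdot\, r_Z)=1]}. \qquad (177)$$


Applying (176) with $b = c = 1$ to commute the measurement operators in the fourth term in (177), and
performing the same steps in reverse, we deduce that on average over commuting $\omega$,

$$X^{r_X}(u_X) Z^{r_Z}(u_Z) \approx_\varepsilon Z^{r_Z}(u_Z) X^{r_X}(u_X).$$

This shows the “commuting” part of (174).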
Now we consider the case that w is anticommuting. Item 7 of Lemma A.7, combined with Lemma A.5, 
implies that the following two approximations hold on average over anticommuting w: 


X” (ux) Q Ip Ne In Q JM VARIABLE, w , (178) 
Zz (uz) Q Ip Me Is Q MVARIABLES 0 , (179) 
where for j € {1,5}, MVYARIABLR — Mo RO — MIATA 




For all anticommuting w, let €,, = 1 — Aw, where Aw is defined in item 6 of Lemma A.7. From item 6 
of Lemma A.7 we get that Ew anticomm. €w < O(€). For each anticommuting w, Theorem 7.10 implies that 


In ® [M VARIABLE w jp VARIABLES, oe -IQ MYARIABLEs,@ jy VARIABLE] i 
w 


iz 


Thus, averaging over anticommuting w, and applying Jensen’s inequality to the definition of ~s, we ob- 
tain the same anticommutation relation with error Ewanticomm. £12 < (Ewanticomm. €w) 2 = O(e!/2). In 
summary, on average over anticommuting w, 


I, ® MVARIABLEL@ jj VARIABLES w N -I8 MVARIABLES, pyp VARIABLE w ; (180) 
Using 5.19 twice we get from (178) and (179) that on average over anticommuting w, 


X’ (ux)Z'2 (uz) Q Ig Xe In Q MVARIABLEL w pM VARIABLES, , (181) 
Z” (ux) X" (uz) Q Ig Me In Q [M VARIABLES w py VARIABLE, ,w ; (182) 


Using (180) we get that on average over anticommuting w, we have 
X” (ux)Z (uz) @ Ip © yg —Z" (uz)X™* (ux) ® Ig. 


This shows the “anticommuting” part of (174). O 


A.3 Expanding the Hilbert space and defining commuting observables 


Expanded state. We introduce registers $A', A'', B', B''$ that are each isomorphic to $(\mathbb{C}^q)^{\otimes M}$. Define the
state

$$|\hat{\psi}\rangle_{AA'A''BB'B''} = |\psi\rangle_{AB} \otimes |\mathrm{EPR}_q\rangle^{\otimes M}_{A'A''} \otimes |\mathrm{EPR}_q\rangle^{\otimes M}_{B'B''}, \qquad (183)$$

where we have added maximally entangled states in registers $A'A''$ and $B'B''$; recall also that the state
$|\psi\rangle$ is already implicitly appended with sufficiently many ancilla qubits initialized in the $|0\rangle$ state. For the
remainder of the Appendix all approximations will be evaluated with respect to this state.


Expanded observables. For all $W \in \{X, Z\}$, $r \in \mathbb{F}_q$, and $u \in \mathbb{F}_q^m$ define the following observable
$\widehat{W}^r(u)$:

$$\widehat{W}^r(u) = W^r(u) \otimes \tau^W(r \cdot \mathrm{ind}_m(u)), \qquad (184)$$

where $W^r(u)$ is the observable defined in Equation (170) and $\tau^W(r \cdot \mathrm{ind}_m(u))$ is the generalized Pauli
observable (see Section 3.7) acting on $(\mathbb{C}^q)^{\otimes M}$.
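For $a \in \mathbb{F}_q$ define corresponding projections

$$\widehat{M}^{(\mathrm{POINT},W),u}_a = \mathop{\mathbb{E}}_{r \in \mathbb{F}_q} (-1)^{\mathrm{tr}(ar)} \, \widehat{W}^r(u).$$

Note that $\widehat{M}^{(\mathrm{POINT},W),u}_a$ can be equivalently written as a sum of projectors by applying Fourier identities.
Specifically, for $W \in \{X, Z\}$, $u \in \mathbb{F}_q^m$ and $a \in \mathbb{F}_q$, define the projector

$$\tau^{W,u}_a = \sum_{h \in \mathbb{F}_q^M : \, h \cdot \mathrm{ind}_m(u) = a} \tau^W_h. \qquad (185)$$

Then we can write $\widehat{M}^{(\mathrm{POINT},W),u}_a$ as follows:

$$\widehat{M}^{(\mathrm{POINT},W),u}_a = \mathop{\mathbb{E}}_{r \in \mathbb{F}_q} (-1)^{\mathrm{tr}(ar)} \, W^r(u) \otimes \tau^W(r \cdot \mathrm{ind}_m(u))$$
$$= \mathop{\mathbb{E}}_{r \in \mathbb{F}_q} (-1)^{\mathrm{tr}(ar)} \Big( \sum_{a' \in \mathbb{F}_q} (-1)^{\mathrm{tr}(a'r)} M^{(\mathrm{POINT},W),u}_{a'} \Big) \otimes \Big( \sum_{h \in \mathbb{F}_q^M} (-1)^{\mathrm{tr}(r \, h \cdot \mathrm{ind}_m(u))} \tau^W_h \Big)$$
$$= \mathop{\mathbb{E}}_{r \in \mathbb{F}_q} (-1)^{\mathrm{tr}(ar)} \Big( \sum_{a' \in \mathbb{F}_q} (-1)^{\mathrm{tr}(a'r)} M^{(\mathrm{POINT},W),u}_{a'} \Big) \otimes \Big( \sum_{a'' \in \mathbb{F}_q} (-1)^{\mathrm{tr}(a''r)} \tau^{W,u}_{a''} \Big)$$
$$= \sum_{a', a'' \in \mathbb{F}_q} \mathop{\mathbb{E}}_{r \in \mathbb{F}_q} (-1)^{\mathrm{tr}((a + a' + a'')r)} \, M^{(\mathrm{POINT},W),u}_{a'} \otimes \tau^{W,u}_{a''}$$
$$= \sum_{\substack{a', a'' \in \mathbb{F}_q \\ a' + a'' = a}} M^{(\mathrm{POINT},W),u}_{a'} \otimes \tau^{W,u}_{a''},$$

where in the second line we have used Equation (170) on the first tensor factor and Equation (20) on the
second, and in the third line we have used Equation (185). For a fixed $u \in \mathbb{F}_q^m$, $r \in \mathbb{F}_q$ and $b \in \mathbb{F}_2$, define a
projection

$$\widehat{M}^{(\mathrm{POINT},W),u,r}_b = \sum_{a \in \mathbb{F}_q : \, \mathrm{tr}(ar) = b} \widehat{M}^{(\mathrm{POINT},W),u}_a, \qquad (186)$$

which can be equivalently written as

$$\widehat{M}^{(\mathrm{POINT},W),u,r}_b = \frac{1}{2}\big(1 + (-1)^b \, \widehat{W}^r(u)\big). \qquad (187)$$

The registers that the operators $\widehat{W}^r(u)$, $\widehat{M}^{(\mathrm{POINT},W),u}_a$ and $\tau^{W,u}_a$ act on depend on context. For
example if $W^r(u)$ is viewed as an operator acting on register $A$, then we will view $\widehat{W}^r(u)$ as acting on
registers $AA'$ (with the $\tau^W$ observable acting on register $A'$). We generally indicate via subscripts which
registers the operators act on (for example we will write $(\widehat{W}^r(u))_{AA'}$ or $(\widehat{W}^r(u))_{BA''}$).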


Partitioning the registers, and symmetries. In the remainder of this section as well as Appendices A.4
and A.5, we will partition the six registers $AA'A''BB'B''$ into two “parties” in two different ways. The first
is to consider $AA'$ together to form a single party, and $BA''$ to form the second party. The second is to
consider $BB'$ to form the first party, and $AB''$ to form the second party. Thanks to the symmetry of the Pauli
basis test under exchanging the two players, as well as the symmetry of the ancilla states in registers $A'A''$
and $B'B''$, every “bipartite” consistency relation we derive between two operators $M, N$ will hold for both
of the two bipartitioning schemes, and with the first and second party interchanged. That is, whenever we
derive a bipartite relation of the form

$$M_{AA'} \simeq_\delta N_{BA''},$$

the same steps, with the registers appropriately changed, will also yield

$$N_{AA'} \simeq_\delta M_{BA''}, \qquad M_{BB'} \simeq_\delta N_{AB''}, \qquad N_{BB'} \simeq_\delta M_{AB''}.$$

Similarly, whenever we derive a single-party relation of the form $M_{AA'} \approx_\delta N_{AA'}$, the same steps with
the registers appropriately changed will also yield the same relation on $BA''$, $BB'$ and $AB''$. We refer to
these relations as symmetric equivalents.




Lemma A.9. There exists a function $\delta_{\mathrm{anticom}}(\varepsilon) = \mathrm{poly}(\varepsilon)$ such that the following holds.
1. (Self-consistency) For all $W \in \{X, Z\}$, on average over $r \in \mathbb{F}_q$ and $u \in \mathbb{F}_q^m$, we have


a rons ) AAS aera 


A’ ~Ne a A” . 


2. (Approximate commutation) On average over uniformly random w = (ux, uz, "x,"z), we have 


~ (POINT,X ),ux,rx 4%(POINT,Z),Uz,r 
(xa! ),ux x 4! )uz 2) 


p ENTZ) uzrz yy (POINT,X) tex 
b! 


AA’ ~Sayricom ( b' b Jaa’ È 


Furthermore, all symmetric equivalents of these approximations also hold. 


Proof. Item 1, the self-consistency of { y (onan) eh. follows from the fact that the original points mea- 


surements fir OEA are approximately self-consistent between registers A and B in |ý} (item 1 in 


Lemma A.7), the Pauli projections {T} (indm(u))} are perfectly self-consistent between registers A’ and 
A”, and Fact 5.24. 

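We now establish the approximate commutation relations. On average over uniformly random $\omega = (u_X, u_Z, r_X, r_Z)$, we have
$$\begin{aligned}
\hat{M}^{(\mathrm{POINT},X),u_X,r_X}_a \, \hat{M}^{(\mathrm{POINT},Z),u_Z,r_Z}_b
&= \frac{1}{4}\big(1 + (-1)^a \hat{X}^{r_X}(u_X)\big)\big(1 + (-1)^b \hat{Z}^{r_Z}(u_Z)\big) \\
&= \frac{1}{4}\big(1 + (-1)^a \hat{X}^{r_X}(u_X) + (-1)^b \hat{Z}^{r_Z}(u_Z) + (-1)^{a+b} \hat{X}^{r_X}(u_X)\hat{Z}^{r_Z}(u_Z)\big),
\end{aligned}$$
where the first equality follows from (187). Note that
$$\begin{aligned}
\hat{X}^{r_X}(u_X)\,\hat{Z}^{r_Z}(u_Z)
&= X^{r_X}(u_X) Z^{r_Z}(u_Z) \otimes \tau^X(r_X \cdot \mathrm{ind}_m(u_X))\,\tau^Z(r_Z \cdot \mathrm{ind}_m(u_Z)) \\
&\approx Z^{r_Z}(u_Z) X^{r_X}(u_X) \otimes \tau^Z(r_Z \cdot \mathrm{ind}_m(u_Z))\,\tau^X(r_X \cdot \mathrm{ind}_m(u_X)) \\
&= \hat{Z}^{r_Z}(u_Z)\,\hat{X}^{r_X}(u_X),
\end{aligned}$$
where the first equality follows from (184) and for the second line we used the approximate (anti-)commutation relation (174) for the $X^{r_X}(u_X)$ and $Z^{r_Z}(u_Z)$ observables, as well as the exact (anti-)commutation relations for the $\tau^Z(r_Z \cdot \mathrm{ind}_m(u_Z))$ and $\tau^X(r_X \cdot \mathrm{ind}_m(u_X))$ observables. This shows the desired approximate commutation relation, with $\delta_{\mathrm{ANTICOM}} = \sqrt{\varepsilon}$. □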


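Lemma A.10. There exists a function $\delta_{\mathrm{LINE}}(\varepsilon) = \mathrm{poly}(\varepsilon)$ such that the following holds. For all $W \in \{X, Z\}$ and lines $\ell$ there exists a projective measurement $\{\hat{M}^{(\mathrm{LINE},W),\ell}_f\}_{f \in \deg_{md}(\ell)}$ acting on registers $AA'$ or $BA''$, such that:

1. (Self-consistency) For all $W \in \{X, Z\}$ and $r \in \mathbb{F}_q$, on average over a uniformly random pair $(\ell, u)$ drawn from the line-point distribution (Definition 7.6),
$$\big(\hat{M}^{(\mathrm{LINE},W),\ell}_f\big)_{AA'} \simeq_{\varepsilon} \big(\hat{M}^{(\mathrm{LINE},W),\ell}_f\big)_{BA''}.$$

2. (Consistency with points measurements I) On average over a uniformly random pair $(\ell, u)$ drawn from the line-point distribution,
$$\big(\hat{M}^{(\mathrm{LINE},W),\ell}_f\big)_{AA'} \approx_{\varepsilon} \big(\hat{M}^{(\mathrm{LINE},W),\ell}_f\big)_{AA'} \otimes \big(\hat{M}^{(\mathrm{POINT},W),u}_{f(u)}\big)_{BA''}.$$

3. (Consistency with points measurements II) On average over a uniformly random pair $(\ell, u)$ drawn from the line-point distribution,
$$\big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \simeq_{\delta_{\mathrm{LINE}}} \big(\hat{M}^{(\mathrm{POINT},W),u}_a\big)_{BA''}.$$

Furthermore, all symmetric equivalents of these approximations also hold.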
Proof. Define the operator 


Nine ).e = ye Q ie , 


where the Pauli operator acting on register $A'$ is obtained by measuring all $M$ qudits in the basis $W$ to obtain an outcome $h \in \mathbb{F}_q^M$ and then returning the restriction $f''$ of $g_h : \mathbb{F}_q^m \to \mathbb{F}_q$ to the line $\ell$. Here, $g_h$ is the degree-$d$ low-degree extension of $h$ defined in (12).

Item 1, the self-consistency of $\{\hat{M}^{(\mathrm{LINE},W),\ell}_f\}$, follows from the fact that the original lines measurement $\{M^{(\mathrm{LINE},W),\ell}_f\}$ is self-consistent between registers $A$ and $B$ in $|\psi\rangle$ (Item 1 in Lemma A.7), the Pauli projections are perfectly self-consistent between registers $A'$ and $A''$, and Fact 5.24.


Item 2, the consistency with the points measurements, follows because on average over $(\ell, u)$ drawn from the line-point distribution we have


(Mrs (188) 
= E MPT rt aw (189) 
fl f"€deg,,4(£): 
fi +f’ =f 
= (Linz,W),£ we (LINE,W),£ 
= F ok 0) (Mp, Q Tgn ) f (Mieval,(-)=f"(u))AA’ (190) 
uf Sind : 
fi+fl=f 
LINE,W),£ ; P W), 
are m 3 o (MẸ INE,W) Q ane S Cae LN (191) 
J ie eS : 
PAP- 
LinE,W),£ 5 PoINT,W), ; 
a 3 E C Oa N Ma Oh yaa (192) 
iB We CL : 
F= 
= E MP ow we] D MEME e rw”) n (193) 
ff" Edeging(l): a’ al’: 
f'+f"=f a +a" =f! u)+f" (u) 
= (MY as 8 (Ma a i (194) 




Equation (190) follows from the projectivity of the line measurements. Equation (191) follows from the consistency between the line and point measurements (Item 2 in Lemma A.7) together with Fact 5.20. Equation (192) follows from the exact consistency between the Pauli line and point measurements on the $A'$ and $A''$ registers. Equation (193) follows because the consistency of the Pauli measurements forces $a' = f'(u)$ and $a'' = f''(u)$. This establishes Item 2, with approximation error $\varepsilon$.

To show Item 3 we would like to first apply Fact 5.24 to Equation (194), to sum over all functions $f$ that evaluate to the same value at a given point $u$. Unfortunately, we cannot use the Fact as written, since the bound in Item 2 is in terms of $\approx$, not $\simeq$, and moreover the right-hand side does not have the form $I \otimes B$. Instead, we show the desired bound by direct calculation from Equation (194):


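$$\begin{aligned}
\varepsilon &\ge \mathop{\mathbb{E}}_{(\ell,u)} \sum_f \langle\hat\psi| \Big( \big(\hat{M}^{(\mathrm{LINE},W),\ell}_f\big)_{AA'} \otimes \big(I - \hat{M}^{(\mathrm{POINT},W),u}_{f(u)}\big)_{BA''} \Big)^2 |\hat\psi\rangle && (195) \\
&= \mathop{\mathbb{E}}_{(\ell,u)} \sum_f \langle\hat\psi| \big(\hat{M}^{(\mathrm{LINE},W),\ell}_f\big)_{AA'} \otimes \big(I - \hat{M}^{(\mathrm{POINT},W),u}_{f(u)}\big)_{BA''} |\hat\psi\rangle && (196) \\
&= \mathop{\mathbb{E}}_{(\ell,u)} \sum_a \sum_{f : f(u) = a} \langle\hat\psi| \big(\hat{M}^{(\mathrm{LINE},W),\ell}_f\big)_{AA'} \otimes \big(I - \hat{M}^{(\mathrm{POINT},W),u}_{a}\big)_{BA''} |\hat\psi\rangle && (197) \\
&= \mathop{\mathbb{E}}_{(\ell,u)} \sum_a \langle\hat\psi| \big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \otimes \big(I - \hat{M}^{(\mathrm{POINT},W),u}_{a}\big)_{BA''} |\hat\psi\rangle, && (198)
\end{aligned}$$
where in going to the second line we have used the projectivity of the measurements $\{\hat{M}^{(\mathrm{LINE},W),\ell}_f\}$ and $\{\hat{M}^{(\mathrm{POINT},W),u}_a\}$. Thus, we have deduced that
$$\big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \approx_{\varepsilon} \big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \otimes \big(\hat{M}^{(\mathrm{POINT},W),u}_a\big)_{BA''}. \tag{199}$$
Since both the $\{\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\}$ and $\{\hat{M}^{(\mathrm{POINT},W),u}_a\}$ measurements are projective, we also get
$$\begin{aligned}
I &= \sum_a \big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \otimes I_{BA''} \\
&= \sum_a \Big( \big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \otimes I_{BA''} - \big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \otimes \big(\hat{M}^{(\mathrm{POINT},W),u}_a\big)_{BA''} \Big) \\
&\quad + \sum_a \big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \otimes \big(\hat{M}^{(\mathrm{POINT},W),u}_a\big)_{BA''}. && (200)
\end{aligned}$$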


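For projective sub-measurements $\{A_i\}$ and $\{B_i\}$ and a state $|\varphi\rangle$,
$$\begin{aligned}
\sum_i \langle\varphi|(A_i - B_i)|\varphi\rangle
&= \sum_i \langle\varphi|A_i(A_i - B_i)|\varphi\rangle + \sum_i \langle\varphi|(A_i - B_i)B_i|\varphi\rangle \\
&\le \Big(\sum_i \langle\varphi|A_i|\varphi\rangle\Big)^{1/2} \Big(\sum_i \langle\varphi|(A_i - B_i)^2|\varphi\rangle\Big)^{1/2} \\
&\quad + \Big(\sum_i \langle\varphi|(A_i - B_i)^2|\varphi\rangle\Big)^{1/2} \Big(\sum_i \langle\varphi|B_i|\varphi\rangle\Big)^{1/2} \\
&\le 2 \Big(\sum_i \langle\varphi|(A_i - B_i)^2|\varphi\rangle\Big)^{1/2}.
\end{aligned}$$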
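The inequality above can be sanity-checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the simplest setting of a single pair of rank-one projectors $A = |a\rangle\langle a|$ and $B = |b\rangle\langle b|$ on $\mathbb{R}^2$ (each a one-element projective sub-measurement), and the helper names `unit`, `apply_projector` and `both_sides` are hypothetical.

```python
import math

def unit(theta):
    """Unit vector in R^2 at angle theta."""
    return (math.cos(theta), math.sin(theta))

def apply_projector(v, phi):
    """Apply the rank-one projector |v><v| to the vector phi."""
    overlap = v[0] * phi[0] + v[1] * phi[1]
    return (overlap * v[0], overlap * v[1])

def both_sides(theta_a, theta_b, theta_phi):
    """Return (lhs, rhs) for A = |a><a|, B = |b><b| and the state phi.

    lhs = <phi|(A - B)|phi>
    rhs = 2 * (<phi|(A - B)^2|phi>)^(1/2) = 2 * ||(A - B) phi||
    """
    phi = unit(theta_phi)
    a_phi = apply_projector(unit(theta_a), phi)
    b_phi = apply_projector(unit(theta_b), phi)
    diff = (a_phi[0] - b_phi[0], a_phi[1] - b_phi[1])
    lhs = phi[0] * diff[0] + phi[1] * diff[1]
    rhs = 2.0 * math.sqrt(diff[0] ** 2 + diff[1] ** 2)
    return lhs, rhs
```

In this toy case the bound is far from tight; the point is only to make the two sides of the inequality concrete.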




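Applying this fact to the projective sub-measurements $\big\{\big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \otimes I_{BA''}\big\}_a$ and $\big\{\big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \otimes \big(\hat{M}^{(\mathrm{POINT},W),u}_a\big)_{BA''}\big\}_a$ in (200) and using (199) gives
$$I \approx_{\sqrt{\varepsilon}} \sum_a \big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \otimes \big(\hat{M}^{(\mathrm{POINT},W),u}_a\big)_{BA''}. \tag{201}$$

To go from (201) to Item 3 of the lemma we use another fact about projective measurements. For families of projective measurements $\{A^x_a\}$ and $\{B^x_a\}$ acting on separate registers of a bipartite state $|\varphi\rangle$, suppose $I \approx_\delta \sum_a A^x_a \otimes B^x_a$. Then
$$\begin{aligned}
\delta &\ge \mathop{\mathbb{E}}_x \Big\| \Big(I - \sum_a A^x_a \otimes B^x_a\Big)|\varphi\rangle \Big\|^2 \\
&= \mathop{\mathbb{E}}_x \Big( 1 + \langle\varphi|\Big(\sum_a A^x_a \otimes B^x_a\Big)^2|\varphi\rangle - 2 \sum_a \langle\varphi|A^x_a \otimes B^x_a|\varphi\rangle \Big) \\
&= 1 - \mathop{\mathbb{E}}_x \sum_a \langle\varphi|A^x_a \otimes B^x_a|\varphi\rangle,
\end{aligned}$$
where we used the projectivity in going from the second to the third line, to remove the square. Thus
$$\mathop{\mathbb{E}}_x \sum_{a \ne b} \langle\varphi|A^x_a \otimes B^x_b|\varphi\rangle \le \delta$$
and $A^x_a \otimes I \simeq_\delta I \otimes B^x_a$. By Item 1 of Fact 5.18, this in turn implies that $A^x_a \otimes I \approx_\delta I \otimes B^x_a$. Applying this fact to the projective measurements $\big\{\big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'}\big\}_a$ and $\big\{\big(\hat{M}^{(\mathrm{POINT},W),u}_a\big)_{BA''}\big\}_a$ and using (201), we get
$$\big(\hat{M}^{(\mathrm{LINE},W),\ell}_{[\mathrm{eval}_u(\cdot)=a]}\big)_{AA'} \simeq_{\sqrt{\varepsilon}} \big(\hat{M}^{(\mathrm{POINT},W),u}_a\big)_{BA''}.$$
Taking the function $\delta_{\mathrm{LINE}}$ in the lemma to be $\sqrt{\varepsilon}$, this yields the conclusion of Item 3 of the lemma. □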


A.4 Combining the X and Z measurements 


In this section our goal is to create a set of combined measurements that simultaneously measure the 
approximately-commuting measurements constructed in the previous section. We do this first for the points. 
Recall that $\mathcal{H}_A$ and $\mathcal{H}_B$ denote the Hilbert spaces corresponding to registers $A$ and $B$, respectively.


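Lemma A.11. There exists a function $\delta_Q(\varepsilon) = \mathrm{poly}(\varepsilon)$ such that the following holds. For every $x, z \in \mathbb{F}_q^m$ and $\mathcal{H} \in \{\mathcal{H}_A, \mathcal{H}_B\}$ there are projective measurements $\{\hat{Q}^{x,z}_{a,b}\}_{a,b \in \mathbb{F}_q}$ acting on $\mathcal{H} \otimes (\mathbb{C}^q)^{\otimes M}$ such that:

1. (Self-consistency) On average over uniformly random $x, z \in \mathbb{F}_q^m$,
$$\big(\hat{Q}^{x,z}_{a,b}\big)_{AA'} \simeq_{\delta_Q} \big(\hat{Q}^{x,z}_{a,b}\big)_{BA''}. \tag{202}$$

2. (Consistency with M) On average over uniformly random $x, z \in \mathbb{F}_q^m$,
$$\big(\hat{Q}^{x,z}_{a,b}\big)_{AA'} \approx_{\delta_Q} \big(\hat{M}^{(\mathrm{POINT},X),x}_a \, \hat{M}^{(\mathrm{POINT},Z),z}_b\big)_{BA''}, \tag{203}$$
$$\big(\hat{Q}^{x,z}_{a,b}\big)_{AA'} \approx_{\delta_Q} \big(\hat{M}^{(\mathrm{POINT},Z),z}_b \, \hat{M}^{(\mathrm{POINT},X),x}_a\big)_{BA''}. \tag{204}$$

Furthermore, all symmetric equivalents of these approximations also hold.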


The construction of the measurements $\{\hat{Q}^{x,z}_{a,b}\}$ whose existence is guaranteed in the lemma proceeds in three steps. In the first step, for every $(r, s) \in \mathbb{F}_q^2$ we combine the binary-outcome measurements $\{\hat{M}^{(\mathrm{POINT},X),x,r}_a\}$ and $\{\hat{M}^{(\mathrm{POINT},Z),z,s}_b\}$; this is made possible by their approximate commutation as shown in Lemma A.9. In a second step, we show that the measurements thus constructed have a particular “approximate linearity” property, when seen as a function of the pair $(r, s)$. This allows us, in the third step, to combine them into a single measurement $\{\hat{Q}^{x,z}_{a,b}\}$ with outcomes in $\mathbb{F}_q^2 \cong \mathbb{F}_2^{2t}$ by applying the following result from [NV17]:


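Theorem A.12 (Quantum linearity test). Let $0 < \delta < 1$, let $t$ be a positive integer, and let $|\psi\rangle$ be a state in a Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$. Let $\{O^u\}$ be a set of observables indexed by $u \in \mathbb{F}_2^t$ that act on $\mathcal{H}_A$. Suppose that on average over $u, u'$ chosen uniformly from $\mathbb{F}_2^t$, the following “approximate linearity” relation holds:
$$(O^u O^{u'})_A \approx_\delta (O^{u+u'})_A,$$
where the $\approx$ statement is taken with respect to the state $|\psi\rangle$. Then there exists an extended state $|\hat\psi\rangle_{AA'B} = |\psi\rangle_{AB} \otimes |0\rangle_{A'}$ in an extended space $\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_{A'}$ and observables $\{L^u\}$ acting on $\mathcal{H}_A \otimes \mathcal{H}_{A'}$ such that for all $u, u' \in \mathbb{F}_2^t$,
$$L^u L^{u'} = L^{u+u'} \quad\text{and}\quad (L^u)_{AA'} \approx_\delta (O^u)_A.$$
Furthermore, the observable $L^u = I$ when $u = 0$.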
We now give the proof of Lemma A.11. 


Proof of Lemma A.11. For convenience in the proof we slightly abuse the notation $\approx_\delta$ to mean $\approx_{\mathrm{poly}(\varepsilon)}$ for some polynomial function $\delta = \mathrm{poly}(\varepsilon)$ that may differ each time the notation is used.

First step: construction of projective binary measurements $\{R^\omega_{a,b}\}$. For every $\omega = (x, z, r, s)$ and $a, b \in \mathbb{F}_2$, define a measurement operator
$$R^\omega_{a,b} = \hat{M}^{(\mathrm{POINT},Z),z,s}_b \, \hat{M}^{(\mathrm{POINT},X),x,r}_a \, \hat{M}^{(\mathrm{POINT},Z),z,s}_b.$$
It is clear that $\{R^\omega_{a,b}\}_{a,b \in \mathbb{F}_2}$ is a POVM. Observe that on average over uniformly random $\omega$,
$$R^\omega_{a,b} \approx_{\delta_{\mathrm{ANTICOM}}} \hat{M}^{(\mathrm{POINT},Z),z,s}_b \, \hat{M}^{(\mathrm{POINT},X),x,r}_a, \tag{205}$$
which follows from the approximate commutation relation of Lemma A.9 and the projectivity of $\{\hat{M}^{(\mathrm{POINT},Z),z,s}_b\}$, together with Fact 5.19. Next we show that $\{R^\omega_{a,b}\}$ is approximately self-consistent: we aim to show that on average over $\omega$, it holds that $(R^\omega_{a,b})_{AA'} \simeq_\delta (R^\omega_{a,b})_{BA''}$ for some polynomial function $\delta = \mathrm{poly}(\varepsilon)$. To show this, we perform the following calculation (recall the state $|\hat\psi\rangle$ defined in (183)):
E Yi (P| (Re,) aa’ 8 (Re) par?) 
a,b 
O 2), ~ (POINT,X), 
ys ED PIM Ppap e a Raah) 


ANTICOM WwW 


~y a EFO (POINT,Z),z, O aaa T Q (A TEN, rons) ani) 


ANTICOM W ab 
A a (P PAEA a (P WL) ,Zy P X), 
~ ye ELGI OINT, VZS) a @ (MI OINT,Z),z,8 Mt OINT, an art) 
a,b 
A ~ (P WAFA P Z); 
= EP (PIME TO) ga @ (METOE) paG) 


= E PGI OO an lB) 
=1. 


The approximation in (206) is derived by bounding the magnitude of the difference: 


Egoe (p|(R Bi C LA, ag S Riaad) 


ii \Z),2Z, POINT,X),x 


s Wl 


IA 


2 lÊ) ` ELIR paarl) 


© ab 


E) (pl ( (Re, =M 
© ab 


1/2 
>y ÔANTICOM ` v1 , 


A 


(206) 


(207) 


(208) 


(209) 


where the second line follows from Cauchy-Schwarz, and the third line follows from (205) and the fact that $\sum_{a,b} (R^\omega_{a,b})^2 \le I$. The approximation in (207) follows from a similar calculation. The approximation in (208) is derived by bounding the magnitude of the difference:


E ri Saal Ei: _ MO) 2 Ga . Fi ae a] 


cm 


a 


(BI( apron)" Q V a *) ; (Lu _ PONT), “) @ LPOINTX) x, r)i) 
b 


POINT,X),x, 


“ ~-(POINT,X),X, ( A 
EJ (IU - Ma TaS (Ma nal 
ED (#l( _ Money, i Q re, p) 


= 1- EDL (GIMET y B (Mg gulf) 


= 3B LONO) y 8 Isar = Tans B (METO ar) 
a 




(210) 


(211) 


(212) 


(213) 


(214) 


(215) 


Here Equation (211) follows from the Cauchy-Schwarz inequality, the projectivity of al and the 


fact that } ORES © ae “S < I. Equations (212) to (214) follow from projectivity and com- 
pleteness of Fy aman ae Equation (215) follows from the self-consistency of V a X) aor established 
in Lemma A.9. 

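The approximation in (209) follows from the self-consistency of $\{\hat{M}^{(\mathrm{POINT},Z),z,s}_b\}$, again from Lemma A.9, together with projectivity and Fact 5.18.

This shows that $(R^\omega_{a,b})_{AA'} \simeq_\delta (R^\omega_{a,b})_{BA''}$, as desired. Combined with Fact 5.18, we have
$$(R^\omega_{a,b})_{AA'} \approx_\delta (R^\omega_{a,b})_{BA''}. \tag{216}$$
Finally we apply the orthonormalization Lemma (Lemma A.6) to $\{R^\omega_{a,b}\}$ in order to obtain projective measurements $\{\hat{L}^\omega_{a,b}\}$ such that on average over $\omega$,
$$\hat{L}^\omega_{a,b} \approx_\delta R^\omega_{a,b}. \tag{217}$$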
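Second step: approximate linearity relations for $\{\hat{L}^\omega_{a,b}\}$. We proceed in three steps. First we establish linearity relations for $\{\hat{M}^{(\mathrm{POINT},W),u,r}_b\}$. For all $W \in \{X, Z\}$, $u \in \mathbb{F}_q^m$, $r, r' \in \mathbb{F}_q$, and $c \in \mathbb{F}_2$,
$$\begin{aligned}
\sum_{\substack{b, b' \in \mathbb{F}_2: \\ b + b' = c}} \hat{M}^{(\mathrm{POINT},W),u,r}_b \, \hat{M}^{(\mathrm{POINT},W),u,r'}_{b'}
&= \sum_{\substack{a, a' \in \mathbb{F}_q: \\ \mathrm{tr}(ar + a'r') = c}} \hat{M}^{(\mathrm{POINT},W),u}_a \, \hat{M}^{(\mathrm{POINT},W),u}_{a'} \\
&= \sum_{a : \mathrm{tr}(a(r + r')) = c} \hat{M}^{(\mathrm{POINT},W),u}_a \\
&= \hat{M}^{(\mathrm{POINT},W),u,r+r'}_c, && (218)
\end{aligned}$$
where the second line follows from projectivity of $\{\hat{M}^{(\mathrm{POINT},W),u}_a\}$.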
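The index bookkeeping behind (218) can be checked mechanically. The sketch below is illustrative only and makes two assumptions not fixed by the text: the field is taken to be $\mathbb{F}_8$ with modulus $x^3 + x + 1$, and the fine-grained measurement is modelled as a projective measurement whose elements are simply labelled by their outcome $a$, so that a product of coarse-grained operators is determined by the set of labels it retains. The names `gf_mul`, `trace`, `coarse_outcomes` and `product_support` are hypothetical.

```python
T = 3             # work in GF(2^T); an illustrative choice
MODULUS = 0b1011  # x^3 + x + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply two elements of GF(2^T), each encoded as a T-bit integer."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << T):
            a ^= MODULUS
    return result

def trace(a):
    """Field trace tr(a) = a + a^2 + ... + a^(2^(T-1)), an element of F_2."""
    total, power = a, a
    for _ in range(T - 1):
        power = gf_mul(power, power)
        total ^= power
    return total

def coarse_outcomes(r, b):
    """Labels a grouped into the binary outcome b of the r-th measurement: tr(a r) = b."""
    return {a for a in range(1 << T) if trace(gf_mul(a, r)) == b}

def product_support(r, r2, c):
    """Labels a that survive in the sum over b + b' = c of M^r_b M^{r2}_{b'}.

    For a projective fine-grained measurement, M_a M_a' vanishes unless a = a',
    so each product keeps exactly the intersection of the two label sets.
    """
    support = set()
    for b in (0, 1):
        support |= coarse_outcomes(r, b) & coarse_outcomes(r2, b ^ c)
    return support
```

For every `r`, `r2` and `c`, `product_support(r, r2, c)` coincides with `coarse_outcomes(r ^ r2, c)`, which is the content of (218) in this toy model; addition in the field is bitwise XOR.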


Second we show approximate linearity of the $\{R^\omega_{a,b}\}$. On average over $x, z \in \mathbb{F}_q^m$ and $r, r', s, s' \in \mathbb{F}_q$,


peers Ren is 
‘Ee a,b a’,b! 


ri 
a,b,a',b': i 


m D (Rar) e la), (219) 


b' a’ 


x5 D T a f yee) Me 2 Cee aaa) E (220) 


K 
mM 
TT 
= 


P Z ^(P Z ; a (P X 5 
OINT, ag S (my OINT,Z),Z,8 MA! OINT, a (221) 
AA’ BA” 


x5 D Ce) S [aa A , apane] l 222) 
b AA’ b BA” 




In each approximation, the answer summation is over $c, d \in \mathbb{F}_2$. Each approximation follows by applying a previously derived approximation to part of the expression, together with Fact 5.19. Specifically, line (219) uses the self-consistency of $\{R^\omega_{a,b}\}$ shown in (216). Line (220) follows from using (205) twice. Line (221) follows from the self-consistency and linearity properties of the points measurements $\{\hat{M}^{(\mathrm{POINT},W),u,r}_a\}$ shown in Lemma A.9 and (218) respectively. Line (222) follows from the approximate commutativity relation shown in Lemma A.9. Continuing,


(222) %5 err ; een E (223) 
PS e 224 

ô ( cd BA” ( ) 

Z oe) : 225 

5 (RY 7 (225) 


Line (223) follows from self-consistency and linearity properties of the points measurements $\{\hat{M}^{(\mathrm{POINT},W),u,r}_a\}$. Line (224) follows from (205) and Lemma A.9, and line (225) follows from the self-consistency of $\{R^\omega_{a,b}\}$ from Eq. (216).

Third we deduce approximate linearity of the $\{\hat{L}^\omega_{a,b}\}$. On average over $x, z \in \mathbb{F}_q^m$ and $r, r', s, s' \in \mathbb{F}_q$, we have


XZ r,S #X,Z,1',8 a X,Z,r,S $ x, zrs! 
3 e at Jaami 3 ki Ja ® (4 ) 


EP A x,zr+r',s+s' 
ws (Regret) 
AA 


~5 ae) A (226) 
f AA 

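The first two approximations follow from the closeness of the $\{R^\omega_{a,b}\}$ and $\{\hat{L}^\omega_{a,b}\}$ guaranteed by (217) as well as the self-consistency of $\{R^\omega_{a,b}\}$ shown in (216). The third approximation follows from the linearity properties of $\{R^\omega_{a,b}\}$ in (222).

Third step: construction of $\{\hat{Q}^{x,z}_{a,b}\}$. For each $\omega$ define an observable $O^\omega = \sum_{a,b \in \mathbb{F}_2} (-1)^{a+b} \hat{L}^\omega_{a,b}$. The self-consistency of $O^\omega$ between registers $AA'$ and $BA''$ follows from the self-consistency of the $\hat{L}^\omega_{a,b}$, which is obtained by combining (216) and (217). We now verify approximate linearity. On average over $x, z \in \mathbb{F}_q^m$ and $r, r', s, s' \in \mathbb{F}_q$,
$$\begin{aligned}
O^{x,z,r,s} \, O^{x,z,r',s'} &= \sum_{a,b,a',b'} (-1)^{a+b+a'+b'} \hat{L}^{x,z,r,s}_{a,b} \, \hat{L}^{x,z,r',s'}_{a',b'} \\
&\approx_\delta \sum_{c,d} (-1)^{c+d} \hat{L}^{x,z,r+r',s+s'}_{c,d} \\
&= O^{x,z,r+r',s+s'},
\end{aligned}$$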


where the approximation follows from the approximate linearity properties of $\{\hat{L}^\omega_{a,b}\}$ shown in (226). Identify the field $\mathbb{F}_q$ with the vector space $\mathbb{F}_2^t$, and thus we treat pairs $(r, s)$ as vectors in $\mathbb{F}_2^{2t}$. For every $x, z \in \mathbb{F}_q^m$ we can apply Theorem A.12 to the collection of observables $\{O^{x,z,r,s}\}_{r,s \in \mathbb{F}_q}$ to obtain the existence of “exactly linear” observables $\{\mathcal{L}^{x,z,r,s}\}$ that are $\delta$-close to $\{O^{x,z,r,s}\}$, on average over the choice of $x, z, r, s$. By assuming that $|\psi\rangle$ has sufficiently many ancilla zero qubits, we do not need to extend the state further.

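We are ready to define the measurements $\{\hat{Q}^{x,z}_{a,b}\}$ whose existence is claimed in the statement of the lemma. For all $x, z \in \mathbb{F}_q^m$ and $a, b \in \mathbb{F}_q$, define
$$\hat{Q}^{x,z}_{a,b} = \mathop{\mathbb{E}}_{r,s} (-1)^{\mathrm{tr}(ra + sb)} \, \mathcal{L}^{x,z,r,s}.$$
This operator is a projection (and thus positive semidefinite):
$$\begin{aligned}
\big(\hat{Q}^{x,z}_{a,b}\big)^2 &= \mathop{\mathbb{E}}_{r,s,r',s'} (-1)^{\mathrm{tr}((r+r')a + (s+s')b)} \, \mathcal{L}^{x,z,r,s} \mathcal{L}^{x,z,r',s'} \\
&= \mathop{\mathbb{E}}_{r,s,r',s'} (-1)^{\mathrm{tr}((r+r')a + (s+s')b)} \, \mathcal{L}^{x,z,r+r',s+s'} \\
&= \mathop{\mathbb{E}}_{r,s} (-1)^{\mathrm{tr}(ra + sb)} \, \mathcal{L}^{x,z,r,s}.
\end{aligned}$$
The second equality follows from the exact linearity of the $\{\mathcal{L}^{x,z,r,s}\}$. Next, the operators sum to identity:
$$\sum_{a,b} \hat{Q}^{x,z}_{a,b} = \mathop{\mathbb{E}}_{r,s} \Big( \sum_{a,b} (-1)^{\mathrm{tr}(ra + sb)} \Big) \mathcal{L}^{x,z,r,s} = \mathcal{L}^{x,z,0,0} = I.$$
Thus $\{\hat{Q}^{x,z}_{a,b}\}$ forms a projective measurement.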
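The passage from exactly linear observables to a projective measurement is a Fourier inversion over the index group, and it can be illustrated in a commuting toy model. The sketch below is not the construction of the paper: it assumes indices in $\mathbb{F}_2^N$ with the standard dot product in place of $\mathrm{tr}(ra + sb)$, and diagonal observables $L^u = \sum_x (-1)^{u \cdot x} |x\rangle\langle x|$, which satisfy $L^u L^v = L^{u+v}$ exactly. The names `dot`, `linear_observable` and `fourier_projector` are hypothetical.

```python
from itertools import product

N = 3  # number of bits; indices and outcomes live in F_2^N (illustrative)
VECTORS = list(product((0, 1), repeat=N))

def dot(u, v):
    """Standard inner product over F_2."""
    return sum(ui * vi for ui, vi in zip(u, v)) % 2

def linear_observable(u):
    """Diagonal of L^u = sum_x (-1)^{u.x} |x><x|, so that L^u L^v = L^{u+v}."""
    return [(-1) ** dot(u, x) for x in VECTORS]

def fourier_projector(a):
    """Diagonal of Q_a = E_u (-1)^{u.a} L^u, computed with exact integer arithmetic."""
    totals = [0] * len(VECTORS)
    for u in VECTORS:
        sign = (-1) ** dot(u, a)
        for k, entry in enumerate(linear_observable(u)):
            totals[k] += sign * entry
    # Each total is either 0 or 2^N, so the average over u is exactly 0 or 1.
    assert all(t % len(VECTORS) == 0 for t in totals)
    return [t // len(VECTORS) for t in totals]
```

In this model `fourier_projector(a)` is the indicator of the basis state labelled `a`, so the operators are projections, are mutually orthogonal, and sum to the identity, mirroring the three properties verified above.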
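We establish self-consistency:
$$\begin{aligned}
\big(\hat{Q}^{x,z}_{a,b}\big)_{AA'} &= \Big( \mathop{\mathbb{E}}_{r,s} (-1)^{\mathrm{tr}(ra + sb)} \, \mathcal{L}^{x,z,r,s} \Big)_{AA'} \\
&\approx_\delta \Big( \mathop{\mathbb{E}}_{r,s} (-1)^{\mathrm{tr}(ra + sb)} \, O^{x,z,r,s} \Big)_{AA'} \\
&\approx_\delta \Big( \mathop{\mathbb{E}}_{r,s} (-1)^{\mathrm{tr}(ra + sb)} \, O^{x,z,r,s} \Big)_{BA''} \\
&\approx_\delta \big(\hat{Q}^{x,z}_{a,b}\big)_{BA''}.
\end{aligned}$$
The first approximation is due to Lemma A.4. The second approximation is due to Lemma A.4 and the self-consistency of the $\{O^{x,z,r,s}\}$ observables. The third approximation follows from Lemma A.4 again.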




Finally we show consistency with the $\{\hat{M}^{(\mathrm{POINT},W),u}_a\}$ measurements:


AXZ ny —1 La 
(O23) y ~ CED 7 


1,8 


= = tr(ra+sb)+c+d Pe) 
(E21) 7 


cd 


d 
x = tr(ra+sb)+c a) 
$ (EL ) c,d BA” 


= tr(ra+sb)+c+d y%(POINT,Z),z,S %(POINT,X),x,r 
a (EL CD Ceed M oe 


d 
zZ tr(sb)+d yy (POINT,Z),z a )+c yy (POINT,X),x, 
=(E (— itv HMS *). (£ y (- r(ra eM ”) 


d c 
— __1)tr(sb)+tr(sb") yy (POINT,Z),z 
= EE Ee) 


_ C ; MERA) 


; ; E aoe) 


z BA” 


BA” 


BA” . 


The second line follows from the definition of $\{O^{x,z,r,s}\}$. The third line follows from the closeness of $\{\hat{L}^\omega_{a,b}\}$ and $\{R^\omega_{a,b}\}$. The fourth line follows from (205). The fifth line follows from (186), and the last line from Lemma 3.25.

The above calculation establishes Equation (204) of the lemma. To obtain Equation (203), which is the same consistency relation with X and Z swapped, we may use Lemma A.9 immediately after the fourth line to swap the two operators in the product, and then proceed as above.

An analogous derivation shows the same properties of $\{\hat{Q}^{x,z}_{a,b}\}$ with the tensor factors switched to $BB'$ and $AB''$, respectively. This concludes the proof of the lemma. □


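Lemma A.13. There exists a function $\delta_P(\varepsilon, m, d, q) = \mathrm{poly}(\varepsilon, md/q)$ such that the following holds. For $\mathcal{H} \in \{\mathcal{H}_A, \mathcal{H}_B\}$ and for every pair of lines $\ell_X$ and $\ell_Z$ there exists a POVM $\{\hat{T}^{\ell_X,\ell_Z}_{f_X,f_Z}\}$ acting on $\mathcal{H} \otimes (\mathbb{C}^q)^{\otimes M}$ with answers consisting of pairs of polynomials $(f_X, f_Z)$ in $\deg_{md}(\ell_X) \times \deg_{md}(\ell_Z)$ such that on average over pairs $(\ell_X, x)$ and $(\ell_Z, z)$ independently sampled from the line-point distribution,
$$\big(\hat{T}^{\ell_X,\ell_Z}_{[\mathrm{eval}_x(\cdot)=a,\ \mathrm{eval}_z(\cdot)=b]}\big)_{AA'} \simeq_{\delta_P} \big(\hat{Q}^{x,z}_{a,b}\big)_{BA''},$$
as well as all symmetric equivalents of this approximation. Moreover, if $\ell_W$ is axis-parallel, then $f_W \in \deg_d(\ell_W)$.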


Proof. We construct the required measurements by combining the measurements $\{\hat{M}^{(\mathrm{LINE},W),\ell_W}_{f_W}\}$ for $W \in \{X, Z\}$. For convenience in the proof we slightly abuse the notation $\approx_\delta$ to mean $\approx_{\mathrm{poly}(\varepsilon)}$ for some polynomial function $\delta = \mathrm{poly}(\varepsilon)$ that may differ each time the notation is used.

We first argue that the measurements $\{\hat{Q}^{x,z}_{a,b}\}$ obtained in Lemma A.11 are consistent with the $\{\hat{M}^{(\mathrm{POINT},W),w}_a\}$.




Applying Fact 5.19 to (204), we get that on average over uniformly random x, z, 


A ~,(POINT,Z), ~,(POINT,Z), Ax, 
(O55) aay CP) = (PE) (085) 
i AA’ BA” BA” r AA’ 
a ~-(POINT,Z),Z y%¢(POINT,X),x 
xs (Mi, Mf , 
BA 


Since {Ô%7%ž } is a projective measurement, applying Lemma 5.21 with (Ô*%) aa’ Playing the role of Aj and 
(Om igs aaron) Z) 3a” playing the role of B*, we obtain that 


a,b b 
a ; Me) 227 
(2: £ M ~ D (Ô; A ( : BA” ee 
We now show that on average over uniformly random x, z, 


( 55°) Ne io aoe (228) 


BA” 


This follows from (227) and the following calculation: 


EL lr (oF NA ne) _ (MPE) i) 


z? — Jaar (Qa) na T (EPMO), 10) 
(mp?) aar (Oz 
= EY |r) (C) 
( 


(0% dar (R ; mre) a) |p) 


|I 
Her] 


_ Ci Z) Z 9% ye E o 13) 


| 2 


vee Z)z = I, the third and fourth lines uses the projectivity of 


< I, and the last line uses Equation (204). This shows (228). 


where the second line uses the property £, M 


{4l one: “}, the fifth line uses i 2) 
A similar argument shows that 


AXZ e) , 229 
(9; le , bq ( 4 BA” Co) 
Next, from Lemma A.10 we get that on average over $(\ell_X, x)$ and $(\ell_Z, z)$ sampled independently from the line-point distribution (Definition 7.6),
$$\big(\hat{M}^{(\mathrm{POINT},X),x}_a\big)_{BA''} \approx_\delta \big(\hat{M}^{(\mathrm{LINE},X),\ell_X}_{[\mathrm{eval}_x(\cdot)=a]}\big)_{BA''}, \tag{230}$$
$$\big(\hat{M}^{(\mathrm{POINT},Z),z}_b\big)_{BA''} \approx_\delta \big(\hat{M}^{(\mathrm{LINE},Z),\ell_Z}_{[\mathrm{eval}_z(\cdot)=b]}\big)_{BA''}, \tag{231}$$
where we used both the self-consistency of the line measurements (item 1 in Lemma A.10) as well as their consistency with the points measurements (item 3). To deduce both of these relations, we used Fact 5.18 to convert between $\simeq$ and $\approx$ distance, and Fact 5.24 to sum over outcomes to the lines measurement. Since


the {nen and {Q*"} are projective, together with (228) and (229) by Fact 5.18 we get 
ayz PS ~ (LINE,X),£x ( 33”) ( ~ (LINE,Z),£z ) 
( i Jai ôo Maai par and Q; age ta Mievale(-)= Hl ] par (232) 


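For all lines $\ell_X, \ell_Z$ and $(f_X, f_Z) \in \deg_{md}(\ell_X) \times \deg_{md}(\ell_Z)$ define the operator
$$\hat{T}^{\ell_X,\ell_Z}_{f_X,f_Z} = \hat{M}^{(\mathrm{LINE},X),\ell_X}_{f_X} \, \hat{M}^{(\mathrm{LINE},Z),\ell_Z}_{f_Z} \, \hat{M}^{(\mathrm{LINE},X),\ell_X}_{f_X}.$$
Then the collection $\{\hat{T}^{\ell_X,\ell_Z}_{f_X,f_Z}\}$ forms a valid POVM. Moreover, by the fact that $\{\hat{M}^{(\mathrm{LINE},W),\ell_W}_{f_W}\}$ comes from a valid strategy for the low-degree test, it holds that $f_W \in \deg_d(\ell_W)$ whenever $\ell_W$ is axis-parallel, and thus this property holds for $\hat{T}$ as well.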

It only remains to bound the consistency of Er Z} with o 7}. Using the following choices of 
measurements, 


xn. EX) bx 


xn. NCLINE2),ez lx lz 
8 "oo fx % 


u AX YYZ , AX,Z “ x n 
An „a2 :Q (G1) 8 Of 7 Jzig TRF f 


a,b’ 


“(G2) 


Then the hypotheses of Lemma 5.27 hold, with $\delta$ in the lemma set to $\delta_Q$, and $\gamma$ in the lemma set to $md/q$ (this follows from the Schwartz-Zippel lemma applied to the polynomials $f_X, f_Z$, which are of total degree at most $md$). Applying the lemma, we conclude that
$$\big(\hat{Q}^{x,z}_{a,b}\big)_{AA'} \simeq_{\delta_P} \big(\hat{T}^{\ell_X,\ell_Z}_{[\mathrm{eval}_x(\cdot)=a,\ \mathrm{eval}_z(\cdot)=b]}\big)_{BA''}, \tag{233}$$
where $\delta_P = \mathrm{poly}(\varepsilon, md/q)$ is the $\delta_{\mathrm{pasting}}$ given by Lemma 5.27, conditions (42) and (43) in the lemma are given by (232) and (202) respectively, and the distribution over $(\ell_X, \ell_Z)$ is that of two independently and uniformly chosen lines. This is a symmetric equivalent of the consistency relation in the conclusion of the lemma. Thus, by applying an identical argument with the appropriate modifications to the registers, all symmetric equivalents of this relation hold, establishing the conclusion of the lemma.

□
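The role of the parameter $md/q$ in the proof above is that two distinct low-degree polynomials rarely agree at a random point. A minimal sketch of the univariate case follows; it is illustrative only and assumes a prime field $\mathbb{F}_P$ with $P = 101$ as a stand-in for $\mathbb{F}_q$ (the paper's field has $q = 2^t$), with polynomials given as coefficient lists. The names `evaluate`, `agreement_count` and `max_agreement` are hypothetical.

```python
import random

P = 101  # prime field size, an illustrative stand-in for q

def evaluate(coeffs, t):
    """Horner evaluation of sum_k coeffs[k] * t^k modulo P."""
    value = 0
    for c in reversed(coeffs):
        value = (value * t + c) % P
    return value

def agreement_count(f, g):
    """Number of points t in F_P at which the two polynomials agree."""
    return sum(1 for t in range(P) if evaluate(f, t) == evaluate(g, t))

def max_agreement(degree, trials, seed=0):
    """Largest agreement seen among random pairs of distinct polynomials of the given degree bound."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        f = [rng.randrange(P) for _ in range(degree + 1)]
        g = [rng.randrange(P) for _ in range(degree + 1)]
        if f != g:
            worst = max(worst, agreement_count(f, g))
    return worst
```

Since the difference of two distinct polynomials of degree at most $D < P$ has at most $D$ roots, `max_agreement(D, trials)` never exceeds `D`, so the agreement probability at a uniform point is at most $D/P$.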


A.5 Applying the classical low-degree test 


In the previous section we showed how to combine the approximately commuting X and Z basis point and 
line measurements into a joint point measurement and a joint line measurement. In this section we show 
how these measurements can be combined in a single measurement that returns a pair of global polynomials 
encoding the X-basis and Z-basis information, respectively, by applying the lemma that states quantum 
soundness of the classical low-degree test, Theorem 7.8. 

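Introduce a “combining” map that takes two polynomials $f, g \in \mathrm{ideg}_{d,m}(\mathbb{F}_q)$ and returns a new polynomial $\mathrm{combine}_{f,g} \in \mathrm{ideg}_{d+1,2m+2}(\mathbb{F}_q)$ defined by
$$\mathrm{combine}_{f,g}(x, z, \alpha, \beta) = \alpha f(x) + \beta g(z) \qquad \forall\, x \in \mathbb{F}_q^m,\ z \in \mathbb{F}_q^m,\ \alpha \in \mathbb{F}_q,\ \beta \in \mathbb{F}_q.$$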


This combining map can also be defined for polynomials restricted to lines. Let $\ell$ be a line in $\mathbb{F}_q^{2m+2}$ with intercept $u = (u_X, u_Z, u_\alpha, u_\beta)$ and slope $v = (v_X, v_Z, v_\alpha, v_\beta)$. Let $\ell_X$ and $\ell_Z$ be lines in $\mathbb{F}_q^m$ such that for any point $(x, z, \alpha, \beta) \in \ell$, it holds that $x \in \ell_X$ and $z \in \ell_Z$. Then for any two polynomials $f \in \deg_{md}(\ell_X)$ and $g \in \deg_{md}(\ell_Z)$, we define $\mathrm{combine}_{f,g} \in \deg_{md+1}(\ell)$ by
$$\mathrm{combine}_{f,g}(u + t \cdot v) = (u_\alpha + t \cdot v_\alpha)\, f(u_X + t \cdot v_X) + (u_\beta + t \cdot v_\beta)\, g(u_Z + t \cdot v_Z). \tag{234}$$

In the following two lemmas we define combined point and line measurements according to this definition of $\mathrm{combine}_{f,g}$.
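To make the combining map concrete, here is a small sketch over a prime field. It is illustrative only: the field $\mathbb{F}_{13}$, the dimension $m = 2$, the example polynomials `f` and `g`, and the helper names are all assumptions made for this example (the paper works over $\mathbb{F}_q$ with $q = 2^t$). It evaluates the combined polynomial along a line as in (234) and uses finite differences to confirm that the restriction has degree at most one more than that of `f` and `g`.

```python
P = 13  # prime field size, an illustrative stand-in for q
M = 2   # number of variables of f and g (illustrative)

def f(x):
    """Example polynomial of total degree 2 on F_P^2."""
    return (x[0] * x[0] + 3 * x[0] * x[1] + 5) % P

def g(z):
    """Second example polynomial of total degree 2 on F_P^2."""
    return (2 * z[0] * z[1] + z[1] + 7) % P

def combine(f, g):
    """Return the map (x, z, alpha, beta) -> alpha * f(x) + beta * g(z)."""
    def combined(x, z, alpha, beta):
        return (alpha * f(x) + beta * g(z)) % P
    return combined

def restrict_to_line(h, u, v):
    """Restrict h to the line {u + t v} in F_P^(2M+2), as a function of t."""
    def line_poly(t):
        pt = [(ui + t * vi) % P for ui, vi in zip(u, v)]
        return h(pt[:M], pt[M:2 * M], pt[2 * M], pt[2 * M + 1])
    return line_poly

def finite_difference(values):
    """Forward difference; it lowers the degree of a polynomial sequence by one."""
    return [(b - a) % P for a, b in zip(values, values[1:])]
```

Here `f` and `g` restricted to a line have degree at most 2 in `t`, so the combined restriction has degree at most 3 and its fourth finite difference vanishes.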


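Lemma A.14. For all $x, z \in \mathbb{F}_q^m$ and $\alpha, \beta \in \mathbb{F}_q$ and $\mathcal{H} \in \{\mathcal{H}_A, \mathcal{H}_B\}$ there exists a projective measurement $\{\hat{Q}^{x,z,\alpha,\beta}_a\}_{a \in \mathbb{F}_q}$ acting on $\mathcal{H} \otimes (\mathbb{C}^q)^{\otimes M}$ such that the following holds:

1. (Self-consistency) On average over uniformly random $(x, z, \alpha, \beta) \in \mathbb{F}_q^{2m+2}$,
$$\big(\hat{Q}^{x,z,\alpha,\beta}_a\big)_{AA'} \simeq_{\delta_Q} \big(\hat{Q}^{x,z,\alpha,\beta}_a\big)_{BA''}. \tag{235}$$

2. (Consistency with M) On average over uniformly random $(x, z, \alpha, \beta) \in \mathbb{F}_q^{2m+2}$,
$$\big(\hat{Q}^{x,z,\alpha,\beta}_a\big)_{AA'} \approx_{\delta_Q} \Big( \sum_{a_X, a_Z :\ \alpha a_X + \beta a_Z = a} \hat{M}^{(\mathrm{POINT},X),x}_{a_X} \, \hat{M}^{(\mathrm{POINT},Z),z}_{a_Z} \Big)_{BA''}, \tag{236}$$
$$\big(\hat{Q}^{x,z,\alpha,\beta}_a\big)_{AA'} \approx_{\delta_Q} \Big( \sum_{a_X, a_Z :\ \alpha a_X + \beta a_Z = a} \hat{M}^{(\mathrm{POINT},Z),z}_{a_Z} \, \hat{M}^{(\mathrm{POINT},X),x}_{a_X} \Big)_{BA''}. \tag{237}$$

Furthermore, all symmetric equivalents of these approximations also hold.

Proof. Define
$$\hat{Q}^{x,z,\alpha,\beta}_a = \sum_{a_X, a_Z :\ \alpha a_X + \beta a_Z = a} \hat{Q}^{x,z}_{a_X,a_Z}.$$
The self-consistency and consistency properties of $\{\hat{Q}^{x,z,\alpha,\beta}_a\}$ follow from the consistency properties of $\{\hat{Q}^{x,z}_{a,b}\}$ established by Lemma A.11, and Fact 5.24. □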


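Lemma A.15. There exists a function $\delta_{\mathrm{COMBINE}}(\varepsilon, m, d, q) = \mathrm{poly}(m^2 \varepsilon, md/q)$ such that the following holds. For every line $\ell$ in $\mathbb{F}_q^{2m+2}$ and $\mathcal{H} \in \{\mathcal{H}_A, \mathcal{H}_B\}$ there exists a POVM $\{\hat{Q}^\ell_f\}$ acting on $\mathcal{H} \otimes (\mathbb{C}^q)^{\otimes M}$ with outcomes $f \in \deg_{md+1}(\ell)$ such that on average over a uniformly random pair $(\ell, (x, z, \alpha, \beta))$ drawn from the line-point distribution over $\mathbb{F}_q^{2m+2}$,
$$\big(\hat{Q}^\ell_{[\mathrm{eval}_{(x,z,\alpha,\beta)}(\cdot)=a]}\big)_{AA'} \simeq_{\delta_{\mathrm{COMBINE}}} \big(\hat{Q}^{x,z,\alpha,\beta}_a\big)_{BA''}, \tag{238}$$
$$\big(\hat{Q}^\ell_{[\mathrm{eval}_{(x,z,\alpha,\beta)}(\cdot)=a]}\big)_{BA''} \simeq_{\delta_{\mathrm{COMBINE}}} \big(\hat{Q}^{x,z,\alpha,\beta}_a\big)_{AA'}, \tag{239}$$
where the answer summation is over $a \in \mathbb{F}_q$. Moreover, if $\ell$ is axis-parallel, then the outcome $f \in \deg_d(\ell)$.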


Before proving the lemma we give definitions regarding distributions over lines and points. 




Definition A.16. For any $i \in \{1, \ldots, m\}$ the $i$-th restricted axis-parallel line distribution $D_{\mathrm{ALINE},i}$ is the restriction of $D_{\mathrm{ALINE}}$ to pairs $(\ell, u)$ where the line $\ell$ is along the direction $e_i$ (i.e. is of the form $\ell = (u_0, e_i)$). Similarly, the $i$-th restricted diagonal line distribution $D_{\mathrm{DLINE},i}$ is the restriction of $D_{\mathrm{DLINE}}$ to pairs $(\ell, u)$, where the line $\ell = (u_0, s, v)$ is such that $\chi(s) = i$, and coordinates $1, \ldots, i-1$ of $v$ are 0. (Recall that $\chi$ is defined in (50).)


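Observe that $D_{\mathrm{ALINE}}$ is a uniform mixture over the distributions $D_{\mathrm{ALINE},i}$ for $i \in \{1, \ldots, m\}$, and likewise for $D_{\mathrm{DLINE}}$. As a consequence, any approximation that holds on average over the line-point distribution $D_{\mathrm{LINE}}$ with error $\delta$ holds over each of the $i$-th restricted distributions $D_{\mathrm{ALINE},i}$ and $D_{\mathrm{DLINE},i}$ with error $2m\delta$. Similarly, any approximation that holds over $D_{\mathrm{LINE}} \times D_{\mathrm{LINE}}$ holds over any product of the $i$-th restricted distributions with error $4m^2\delta$. For example, by Lemma A.13 for any $t_1, t_2 \in \{\mathrm{ALINE}, \mathrm{DLINE}\}$ and $i, j \in \{1, \ldots, m\}$, we have
$$\mathop{\mathbb{E}}_{(\ell_X, x) \sim D_{t_1,i}} \ \mathop{\mathbb{E}}_{(\ell_Z, z) \sim D_{t_2,j}} \sum_{f_X, f_Z} \langle\hat\psi| \big(\hat{T}^{\ell_X,\ell_Z}_{f_X,f_Z}\big)_{AA'} \otimes \big(I - \hat{Q}^{x,z}_{f_X(x), f_Z(z)}\big)_{BA''} |\hat\psi\rangle = O(m^2 \delta_P). \tag{240}$$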


Lemma A.17. There exists a distribution $D$ over tuples $(\ell, \ell_X, \ell_Z)$ with the following properties:

1. The marginal distribution over $\ell$ is the same as the marginal of the line-point distribution over $\mathbb{F}_q^{2m+2}$.

2. For any point of the form $u = (u_X, u_Z, \alpha, \beta)$ lying on $\ell$, it holds that $u_X \in \ell_X$ and $u_Z \in \ell_Z$. Moreover, for $u$ chosen uniformly at random in $\ell$ the marginal distribution over $(\ell_X, \ell_Z, u_X)$ (resp. over $(\ell_X, \ell_Z, u_Z)$) is such that the marginal over $(\ell_X, \ell_Z)$ is a mixture of $D_{\mathrm{ALINE},i} \times D_{\mathrm{ALINE},j}$ and $D_{\mathrm{DLINE},i} \times D_{\mathrm{DLINE},j}$ for $i, j \in \{1, \ldots, m\}$ and moreover, conditioned on $(\ell_X, \ell_Z)$, $u_X$ is uniformly random on $\ell_X$ (resp. $u_Z$ uniformly random on $\ell_Z$).


Proof. We describe $D$ by giving a procedure to sample from it. We start by sampling $(\ell, u)$ from the line-point distribution over $\mathbb{F}_q^{2m+2}$. Write $u = (u_X, u_Z, \alpha, \beta)$. There are two cases to consider: the case where $\ell$ is axis-parallel and the case where $\ell$ is diagonal. In each case, we sample lines $\ell_X$ and $\ell_Z$ such that the tuples $(\ell_X, \ell_Z, u_X)$ and $(\ell_X, \ell_Z, u_Z)$ satisfy Property 2 of the lemma.

1. Suppose $\ell$ is an axis-parallel line $(v, e_j)$ where $v = (v_X, v_Z, \alpha_0, \beta_0)$. Then we sample $(v'_X, i_X), (v'_Z, i_Z) \in \mathbb{F}_q^m \times \{1, 2, \ldots, m\}$ as described below, and set $\ell_X = (v'_X, e_{i_X})$, $\ell_Z = (v'_Z, e_{i_Z})$.
(a) If $1 \le j \le m$, then set $i_X = j$ and choose $i_Z$ to be uniformly random in $\{1, \ldots, m\}$.
(b) If $m + 1 \le j \le 2m$, then set $i_X$ to be uniformly random in $\{1, \ldots, m\}$ and $i_Z = j - m$.
(c) If $2m + 1 \le j \le 2m + 2$, then set $i_X$ and $i_Z$ to both be uniformly random in $\{1, \ldots, m\}$.
(d) Let $v'_X$ be the canonical representative $L^{\mathrm{LN}}_{e_{i_X}}(v_X)$, and likewise let $v'_Z$ be the canonical representative $L^{\mathrm{LN}}_{e_{i_Z}}(v_Z)$ (see Definition 7.3 for the definition of the canonical representative of a line). Note that $v_X$ lies on the line $\ell_X = (v'_X, e_{i_X})$, and similarly for $v_Z$ and $\ell_Z = (v'_Z, e_{i_Z})$.

We claim that this construction satisfies Property 2 of the lemma. We show this first for $(\ell_X, \ell_Z)$ and $u_X$, the other case being symmetric. If we are in case (a), then $u_X$ is a uniformly random point on the line $\ell_X = (v'_X, e_j)$ and $\ell_Z = (v'_Z, e_{i_Z})$, which is independent from $\ell_X$. Thus, the marginal distribution over $(\ell_X, u_X)$ conditioned on $j$ in this case is equal to $D_{\mathrm{ALINE},j}$, and furthermore even conditioned on $(\ell_X, u_X)$, $\ell_Z$ is distributed according to $D_{\mathrm{ALINE},i_Z}$ for a uniformly random $i_Z$. Cases (b) and (c) are analogous except that in this case even conditioned on $j$, $i_X$ is uniform.




2. Suppose $\ell$ is a random diagonal line $(v, s, w)$, where $v = (v_X, v_Z, \alpha, \beta)$, and where $w = (w_X, w_Z, \gamma, \delta)$ is a direction vector whose first nonzero coordinate is $j = \chi(s)$ (where $\chi$ is defined in Equation (50)). We sample $\ell_X = (v'_X, s_X, w'_X)$ and $\ell_Z = (v'_Z, s_Z, w'_Z)$ in the following manner:
(a) If $1 \le j \le m$, then $w'_X = w_X$ and $w'_Z = w_Z$. We choose $s_X$ and $s_Z$ such that $\chi(s_X) = j$ and $\chi(s_Z) = 1$.
(b) If $m + 1 \le j \le 2m$, then $w'_X$ is a random diagonal line direction (and $s_X$ is chosen accordingly), and $w'_Z = w_Z$, with $s_Z$ such that $\chi(s_Z) = j - m$.
(c) If $2m + 1 \le j \le 2m + 2$ then both $w'_X$ and $w'_Z$ are uniformly random, with $s_X$ and $s_Z$ chosen accordingly.
(d) Set $v'_X = L^{\mathrm{LN}}_{w'_X}(v_X)$ and $v'_Z = L^{\mathrm{LN}}_{w'_Z}(v_Z)$.

We claim that this construction satisfies Property 2 of the lemma. We show this first for $(\ell_X, \ell_Z)$ and $u_X$, the other case being symmetric. If we are in case (a), then $u_X$ is a uniformly random point on the diagonal line $(v'_X, s_X, w'_X)$ which is equal to $\ell_X$. Thus, the marginal distribution over $(\ell_X, u_X)$ conditioned on $j$ in this case is equal to $D_{\mathrm{DLINE},j}$. Furthermore, conditioned on $(\ell_X, u_X)$, $\ell_Z$ is distributed according to $D_{\mathrm{DLINE},1}$. The other cases can be analyzed analogously, with the only difference being how the indices $i, j$ of the $\ell_X$ and $\ell_Z$ distributions are chosen.

Let $D$ be the distribution over tuples $(\ell, \ell_X, \ell_Z)$ induced by this procedure. This distribution $D$ satisfies Property 1 in the lemma by construction, and Property 2 as argued above. □
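The choice of directions in the axis-parallel case of the proof can be written out as a short sampling routine. The sketch below is illustrative and covers only steps 1(a) to 1(c); it assumes, for the exact computation in `marginal_of_i_x`, that the direction index $j$ of $\ell$ is uniform on $\{1, \ldots, 2m+2\}$, which is an assumption of this example rather than a statement taken from the text. The function names are hypothetical.

```python
import random
from fractions import Fraction

def sample_axis_indices(j, m, rng):
    """Return (i_X, i_Z) for an axis-parallel line in direction e_j of F_q^(2m+2)."""
    if 1 <= j <= m:
        # Case (a): the line moves in an X coordinate; the Z direction is free.
        return j, rng.randint(1, m)
    if m + 1 <= j <= 2 * m:
        # Case (b): the line moves in a Z coordinate; the X direction is free.
        return rng.randint(1, m), j - m
    if 2 * m + 1 <= j <= 2 * m + 2:
        # Case (c): the line moves in alpha or beta; both directions are free.
        return rng.randint(1, m), rng.randint(1, m)
    raise ValueError("direction index j must lie in {1, ..., 2m+2}")

def marginal_of_i_x(m):
    """Exact distribution of i_X when j is uniform on {1, ..., 2m+2} (an assumption)."""
    probs = {i: Fraction(0) for i in range(1, m + 1)}
    weight = Fraction(1, 2 * m + 2)
    for j in range(1, 2 * m + 3):
        if 1 <= j <= m:
            probs[j] += weight
        else:
            for i in probs:
                probs[i] += weight / m
    return probs
```

Under the stated assumption every value of $i_X$ has probability exactly $1/m$, consistent with the marginal over $\ell_X$ being a uniform mixture of the restricted axis-parallel distributions.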


Proof of Lemma A.15. We construct the measurements $\hat{Q}^\ell_f$ out of the lines measurements $\hat{T}^{\ell_X,\ell_Z}_{f_X,f_Z}$ guaranteed by Lemma A.13. To do so we first show how to map any line-point pair $(\ell, u)$ in the expanded space $\mathbb{F}_q^{2m+2}$ to two pairs $(\ell_X, u_X)$ and $(\ell_Z, u_Z)$ in $\mathbb{F}_q^m$. We will define $\hat{Q}^\ell$ in terms of $\hat{T}^{\ell_X,\ell_Z}$ and show that the resulting measurement is consistent with the combined points measurement $\hat{Q}^{x,z,\alpha,\beta}$ from Lemma A.14.

To define $\hat{Q}^\ell_f$ we make use of the distribution $D$ from Lemma A.17. Let $D_\ell$ denote the distribution derived from $D$ by conditioning on the first element being $\ell$. Then for every line $\ell$ in $\mathbb{F}_q^{2m+2}$ and function $f \in \deg_{md+1}(\ell)$, define
$$\hat{Q}^\ell_f = \mathop{\mathbb{E}}_{(\ell_X, \ell_Z) \sim D_\ell} \ \sum_{\substack{f_X \in \deg_{md}(\ell_X),\ f_Z \in \deg_{md}(\ell_Z): \\ f = \mathrm{combine}_{f_X,f_Z}}} \hat{T}^{\ell_X,\ell_Z}_{f_X,f_Z},$$
where $\mathrm{combine}_{f_X,f_Z}$ is defined in (234). Note that $\{\hat{Q}^\ell_f\}$ forms a valid POVM for every choice of line $\ell$, and moreover that $\mathrm{combine}_{f_X,f_Z}$ is well-defined for the lines $\ell, \ell_X, \ell_Z$ because of Property 2 of Lemma A.17. Observe that by the construction of $D$ in the proof of Lemma A.17, whenever $\ell$ is axis-parallel, $\ell_X$ and $\ell_Z$ are axis-parallel also. Thus, in this case, by Lemma A.13 it holds that the outcomes $f_X$ and $f_Z$ both have degree $d$. By definition of $\mathrm{combine}_{f_X,f_Z}$ it holds that the outcome $f$ of $\{\hat{Q}^\ell_f\}$ has degree $d$ as well.

We argue the following bound:


E E r 2 @ ( G= * _ O m- el/4 4 81/4 4 §i/44 81/2 : 
(€Lxz)~D er i SK fete) Mi \ J vne)) 
(241) 


This bound will follow from chaining the following three claims. 




Claim A.18. 


E E Tox Z Q A 
(CLx,lz)~D woe” fx fz © EEE lý) 
Ly lL ^ (P "Z \zZ yx) 
Sa E E D PIT @ MoM HB). (242) 


(Céx,lz)~D (x,2,,B)EL fy Fy 
Proof. Form the difference and apply the Cauchy Schwarz inequality to obtain 


POINT,Z) 


TEX £Z AXZ — py! ; 2 yy (POINT) x x 
| (£, me wane ee fu fz 8 (Oia), fe) fz(z) f(x) )|p) 
1/2 
< Tix ay 2 
san ,z)~D ae iy | fuifz )I H 


ly,l OF ~-(POINT,Z),Z y¥(POINT,X),X) | ay 12 
. ee TD ma lv Tsz ® nae) MRO Mil) JI) ) 


1/2 


The first term above is at most 1. For the second term, writing 
XZ [AXZ ~ (POINT,Z),z „y (POINT,X),x 
War E (Qi g M, Ma ) 


for short we bound it as 


l T OWE apol s E E X $US (Wig) b) 


E 
(Llxtz)~D me (Llx bz) ~D (x,z,&,B)EL ab 


TOL 


< dQ. 


Here the first inequality uses that for any $x, z, a, b$, $\sum_{f_X, f_Z :\ f_X(x) = a,\ f_Z(z) = b} \hat{T}^{\ell_X,\ell_Z}_{f_X,f_Z} \le I$. In the equality in the before-last line, the expectation is over independent uniform $x, z \in \mathbb{F}_q^m$. The equality holds because for a line $\ell$ sampled from the line-point distribution $D$, a uniformly random point $(x, z, \alpha, \beta)$ on $\ell$ is distributed as a uniformly random point in $\mathbb{F}_q^{2m+2}$. The last inequality is by (204). □
Claim A.19. 
E E T lz D ENT Z); z y(POINT.X),x ^ 
(llx bz) ~D erga fui fz fz(z z) fx(x) $) 
xa E EIT o MEW O Ip). 43 


LNE (6,0x,Lz)~D (x,2 BEL fy fz 


Proof. We form the difference and apply the Cauchy-Schwarz inequality 


E E TEE Q METZ zr AEX i 
(L£lxkz)~D e a ($l fxfz fz(z) ( fx (x) )Ip) 


1/2 
Eig emer) 


< 


( E 
(Lx Tae e 


lx, l ~ (POINT,X), X5 j 4 
T E Ne du VTE @ ( I~ Mei) Ji) 




The first term above is at most 1. The second term can be bounded as 
E Tex & @ (P 
(Lex lz) ~D (x,z,0,B) Pa l fx fa fx 


= Tx lz Q (I _ EDTA) a4 
ee (xz, a,b) ie $| a ) 1b) 


OINT,X),X\ j ? 
(x 


ne) By 


=1- E E Tex, b7,X Q Monn 344 
(C£x,z)~D cok meer $I |p), (244) 


where we used the shorthand Ti*'2" = Lethon T is and the second equality uses that {Mà (PUDERAI ~) 


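is a projective measurement. Using the definition
$$\hat{T}^{\ell_X,\ell_Z}_{f_X,f_Z} = \hat{M}^{(\mathrm{LINE},X),\ell_X}_{f_X} \cdot \hat{M}^{(\mathrm{LINE},Z),\ell_Z}_{f_Z} \cdot \hat{M}^{(\mathrm{LINE},X),\ell_X}_{f_X}$$
and the fact that $\{\hat{M}^{(\mathrm{LINE},W),\ell_W}_{f_W}\}$ is projective we get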


(LINE,X),£x vit 
iis (Clx,0z)~D (xz, p) E£ 2 [eval;(-)=a] 


POINT,X 


xip). (245) 
By Property 2 of Lemma A.17 the right-hand side of (245) is a mixture of terms of the form 


aE p LOM oaa 2 M OG, (246) 
XX) ~Dri a 


for $t \in \{\mathrm{ALINE}, \mathrm{DLINE}\}$ and $i \in \{1, \ldots, m\}$. Hence by the discussion following Definition A.16 and item 3 of Lemma A.10 it follows that
$$(246) = O(m\, \delta_{\mathrm{LINE}}).$$


Claim A.20. 


E E TE g MENTZ 2) By w 1. 247 
(L£x £z) ~D ahan" vl fufz fz(z) I) vm- (5p 1 +6g *+el/4) Ga) 


Proof. At first we proceed similarly to the proof of Claim A.19, bounding the difference using Cauchy- 
Schwarz as 


E E TZ Q — y EonTZ2)z A 
cere 0 Ene n P ff © -Mre P) 


7 aAA 
S (aaf | — 


Lex, ee 


T£ ~-(POINT,Z),Z\ |A 
; E XAZ Q) -M 
en lz) ~D (x,z,,B) a |l T hfe © fz(z) |p) 


py 




The first term is at most 1. The second, as in the proof of Claim A.19, equals 


1— E E $ (ip TEX22 @ MPONTZ)2 |p) 
(Cx £z)~D (x,z,0,B)EL p 
= E E C [T AZZ D p OTEN |p) (248) 
(Exlz)~Dyix Dy j 2€&Z p 
= E E > (PIT a2 (5 =a,eval,(-)=b] 8 My oe |p) (249) 


(ex, €z)~Dyix Dy j nan xElx q 


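where we use the shorthand $\hat{T}^{\ell_X,\ell_Z}_{[\mathrm{eval}_z(\cdot)=b]} = \sum_{f_X} \hat{T}^{\ell_X,\ell_Z}_{f_X,[\mathrm{eval}_z(\cdot)=b]}$ and $D_{t,i} \times D_{t,j}$ to denote a mixture over all such distributions where $t \in \{\mathrm{ALINE}, \mathrm{DLINE}\}$ and $i, j \in \{1, \ldots, m\}$. Equality holds in (248) because by item 2 of Lemma A.17 for $(\ell, \ell_X, \ell_Z) \sim D$ the marginal distribution on $(\ell_X, \ell_Z)$ is a mixture of the product distributions $D_{\mathrm{ALINE},i} \times D_{\mathrm{ALINE},j}$ and $D_{\mathrm{DLINE},i} \times D_{\mathrm{DLINE},j}$, for $i, j \in \{1, \ldots, m\}$. Moreover, conditioned on $(\ell_X, \ell_Z)$, $z$ is a uniformly random point on $\ell_Z$. The last equality in (249) holds by definition of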


lx, bzzz lx, 
T,* = a= b] 


— Lx ez 
=r fx,levale(-)=b] 


Ly t 
= E Li Trevals(.)=aevak(-)=0 : 
i a 


We next write 
A | 4%-(POINT,X),x 
E CES m(vVõp+ ðo) (xz) pais a 3o 


xely zElZ a,b 


= Lipman Q MOnt) hô) 


A 


where the first approximation follows from the discussion following Definition A.16 applied to the bounds 
(POINT,X), 


POINT,Z),z (POINT,Z 


Mi) Q MONTZ) By 


from Lemma A.17 and Lemma A.11, the second line uses that (m “} is a measurement, and the 
last is by self-consistency of the points measurements (Lemma A.9). O 


Together, Claim A.18, Claim A.19 and Claim A.20 show (241). To complete the proof of (238), observe 
that by Property 2 of Lemma A.17 the right-hand side of Equation (241) is a mixture of terms of the form 


E E DOTKA aa! @ I ÂoD) 


(Exx) ~ Dy i Ezz) ~ Dij fx, fz 
for $t_1, t_2 \in \{\mathrm{ALINE}, \mathrm{DLINE}\}$ and $i,j \in \{1,\dots,m\}$. Hence using Equation (240) it follows that
(241) = O(m°ôp) . 


This establishes the first relation (238). An analogous argument shows the second relation. 




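Lemma A.21. There exists a function $\delta_S(\varepsilon,m,d,q) = a(md)^a(\varepsilon^b + q^{-b} + 2^{-bmd})$ for some universal
constants $a > 1$, $0 < b < 1$ such that the following holds. For $\mathcal{H} \in \{\mathcal{H}_A, \mathcal{H}_B\}$ there exists a projective
measurement $\{\hat{S}_{g_X,g_Z}\}$ acting on $\mathcal{H} \otimes (\mathbb{C}^q)^{\otimes M}$ with outcomes consisting of pairs $(g_X, g_Z)$ of polynomials
each in $\mathrm{ideg}_{d,m}(\mathbb{F}_q)$ such that for $W \in \{X, Z\}$, on average over uniformly random $u \in \mathbb{F}_q^m$,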


A A ,W 7 
(2 eval: pea Aa’ a. Cae T (250) 
(S rait ôs (A EA io (251) 


where the notation $\mathrm{eval}_u(\cdot_W)$ means the evaluation of the first outcome $g_X$ of $\hat{S}$ if $W = X$, or the second
outcome $g_Z$ if $W = Z$, at the point $u$.


Proof. Lemma A.15 establishes the existence of a strategy p = (1p, {04} U TE }) for the classical 


LD 


low-degree test Gig) arams 


for Idparams = (q,2m + 2,d,1), with value 1 — d.ombine- This strategy is not 
necessarily projective because the measurement {Â} is not guaranteed to be; however, we may projectivize 
it by applying Naimark’s theorem to this measurement. This preserves the consistency (œ~) relation obtained 
as the conclusion of Lemma A.15. Theorem 7.8 applied to this projectivized strategy implies the existence 
of a POVM {S.} with outcomes g € ideg42m42(F4) that is 6.-self-consistent and consistent with the 


“points” measurements 6 gas }, where ôn = a(md)" (ôl mbine +9” + 2-2) for some universal con- 
stants a > 1,0 < b < 1 is given by Theorem 7.8. We may apply Naimark’s theorem once again to {Se} to 
obtain a projective measurement with the same consistency guarantees, and henceforth, we will use {8 et to 
refer to this projective measurement. 
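The projectivization step can be illustrated on a toy example. The following Python sketch applies a Naimark-style dilation to a small POVM and checks that the dilated projective measurement reproduces the original outcome probabilities. It is only a sketch under simplifying assumptions: the POVM elements are taken to be diagonal (so that their square roots are entrywise), and the names `POVM`, `dilate`, and `projective_prob` are hypothetical and do not come from the paper.

```python
import math

# Toy POVM on a qubit with three outcomes. Each element is diagonal and is
# stored as its diagonal (an assumption that keeps the square roots trivial).
POVM = [(0.5, 0.25), (0.25, 0.25), (0.25, 0.5)]

def povm_prob(a, psi):
    """Outcome probability <psi|M_a|psi> for the original POVM."""
    return sum(POVM[a][i] * abs(psi[i]) ** 2 for i in range(2))

def dilate(psi):
    """Apply the isometry V|psi> = sum_a sqrt(M_a)|psi> (x) |a>.

    The result lives on (system) x (3-dimensional ancilla) and is stored as a
    dict keyed by (system index, ancilla index).
    """
    return {(i, a): math.sqrt(POVM[a][i]) * psi[i]
            for i in range(2) for a in range(3)}

def projective_prob(a, psi):
    """Probability of the projector I (x) |a><a| on the dilated state."""
    big = dilate(psi)
    return sum(abs(big[(i, a)]) ** 2 for i in range(2))

psi = (0.6, 0.8)  # a normalized example state
```

In this toy setting the dilated state stays normalized and every outcome keeps its probability, which is the property that lets consistency relations survive the dilation.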

We now argue that with high probability, the polynomial $g$ returned by the measurement $\{\hat{S}_g\}$ has the
form $g(x,z,\alpha,\beta) = \alpha g_X(x) + \beta g_Z(z)$ for polynomials $g_X, g_Z \in \mathrm{ideg}_{d,m}(\mathbb{F}_q)$. Let $G$ denote the set of
such polynomials. On average over uniformly random $(x,z,\alpha,\beta)$,


(Se) aa’ = EDIN 
Sin (sJan ® O ph 


E AN 


a a (P X) xa^ (P way 
Sio (Sedan ® (Tavs) My MOND) (252) 
b,b' 
â 5 PAES Y X) 
ôg (Se) wa’ 8 (x 1 b+ Bb'=¢(x,z,0,B) Vi di ) een ) ‘) an (253) 


Here the first approximation uses the projectivity of {8;}, and the first approximation uses the consistency 
of {S.} with {077P} from Theorem 7.8, together with Fact 5.18 to switch from ~ to œ, and Fact 5.20. 


The second, third and fourth approximations follow from the consistency properties of ion } from 
Lemma A.14 combined with Fact 5.20. Note that we will need both Equation (252) and Equation (253) for 
the subsequent calculations. 




For all $g$ define $|\hat\psi_g\rangle = (\hat{S}_g)_{AA'} |\hat\psi\rangle$. Then on average over uniformly random $(x, z, \alpha, \beta)$,


2 


~ (P VX), P Z), 
( > lab+gb'=g (x,z,0,B) Mt TA “My ony Pg A) 
b,b' 


P 


(P Z POINT,X),X y%(POINT,Z),Z)_* 
= A T B 3 1yb-+pb'=¢(x, ZX B)\lab+pb"=g( (x,z,&,b) a ™ n ave om MG ee “| pg) 
b, b', b" 


Sig E by asi fia CAILE (POINT,Z),z ay, a eee (254) 


where in the first equality, we used the fact that aca is projective, and in the approximation, we 


used that for fixed $x,z,\alpha,\beta$ with $\beta \neq 0$, if $\alpha b + \beta b' = g(x,z,\alpha,\beta)$ and $\alpha b + \beta b'' = g(x,z,\alpha,\beta)$, then
$b' = b''$. The probability that $\beta = 0$ is $1/q$, and the term under the approximation has an absolute value of
at most 1, so the error incurred is at most $1/q$.
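The counting fact behind this step can be checked by brute force. The sketch below uses a small prime field $\mathbb{F}_5$ as a stand-in for $\mathbb{F}_q$ (an assumption made purely for simplicity; the paper works with $q$ a power of 2), and the function names are hypothetical.

```python
P = 5  # small prime field standing in for F_q (assumption for illustration)

def solutions(alpha, beta, b, c, p=P):
    """All b' in F_p with alpha*b + beta*b' == c (mod p)."""
    return [bp for bp in range(p) if (alpha * b + beta * bp) % p == c]

def unique_when_beta_nonzero(p=P):
    """For beta != 0 the value b' is determined by (alpha, beta, b, c)."""
    return all(len(solutions(a, be, b, c, p)) == 1
               for a in range(p) for be in range(1, p)
               for b in range(p) for c in range(p))

def prob_beta_zero(p=P):
    """Probability that a uniformly random beta in F_p equals 0."""
    return sum(1 for be in range(p) if be == 0) / p
```

The only uncontrolled case is $\beta = 0$, which has probability $1/q$, matching the error term above.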

We will now show that the right-hand side of Equation (254) is small for polynomials $g$ that are not of
the desired form. Fix $b, b' \in \mathbb{F}_q$, and let $h_{b,b'}(x,z,\alpha,\beta) = g(x,z,\alpha,\beta) - (\alpha b + \beta b')$. Observe that since $g$
has individual degree $d$, so does $h_{b,b'}$. Write

$$h_{b,b'}(x,z,\alpha,\beta) = \sum_{i,j=0}^{d} h_{b,b',i,j}(x,z)\, \alpha^i \beta^j$$


for some polynomials $h_{b,b',i,j}(x,z)$ of individual degree at most $d$. Let $G' \subseteq \mathrm{ideg}_{d,2m+2}(\mathbb{F}_q)$ denote the
polynomials that are linear in $\alpha, \beta$, i.e., $g = \alpha g_1(x,z) + \beta g_2(x,z)$. Fix a $g \notin G'$. Then $h_{b,b'}(x,z,\alpha,\beta)$ is
a nonzero polynomial of total degree at most $(2m+2)d$, and in particular there must exist an $(i,j)$ such
that the polynomial $h_{b,b',i,j}(x,z)$ is not the identically zero polynomial. Call a pair $(x,z) \in \mathbb{F}_q^{2m}$ good if
$h_{b,b'}(x,z,\alpha,\beta)$ is a nonzero polynomial function of $\alpha, \beta$, and bad otherwise. The probability that $(x,z)$
is bad is at most $(2m+2)d/q$ by the Schwartz-Zippel lemma. Otherwise, conditioned on a good $(x,z)$
pair, the probability that $h_{b,b'}(x,z,\alpha,\beta) = 0$ over the choice of $\alpha, \beta$ is at most $(2m+2)d/q$, again by
the Schwartz-Zippel lemma. Thus for $g \notin G'$ we can upper-bound (254) by $\frac{2(2m+2)d}{q} \big\| |\hat\psi_g\rangle \big\|^2$. Combined


with (252) we get that 


gg’ 


(eJaarlÊ) l? < Olo + 5g + md/q). (255) 
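Next, we argue that not only is $g$ linear in $\alpha, \beta$ with high probability, but also that $g = \alpha g_X + \beta g_Z$
where $g_X$ depends on $x$ only and $g_Z$ depends on $z$ only. Fix a polynomial $g \in G'$, i.e., $g(x,z,\alpha,\beta) =
\alpha g_1(x,z) + \beta g_2(x,z)$ for polynomials $g_1, g_2 \in \mathrm{ideg}_{d,2m}(\mathbb{F}_q)$. For all $x,z \in \mathbb{F}_q^m$, $\alpha, \beta, b, b' \in \mathbb{F}_q$,

$$1_{\alpha b + \beta b' = \alpha g_1(x,z) + \beta g_2(x,z)} = 1_{b = g_1(x,z)}\, 1_{b' = g_2(x,z)} + 1_{b \neq g_1(x,z) \,\vee\, b' \neq g_2(x,z)} \cdot 1_{\alpha b + \beta b' = \alpha g_1(x,z) + \beta g_2(x,z)} .$$

By the Schwartz-Zippel lemma,

$$\mathop{\mathbb{E}}_{\alpha,\beta}\; 1_{b \neq g_1(x,z) \,\vee\, b' \neq g_2(x,z)} \cdot 1_{\alpha b + \beta b' = \alpha g_1(x,z) + \beta g_2(x,z)} \le 1/q . \qquad (256)$$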


Let $G' = G'_X \cup G'_Z \cup G$, where $G'_X$ (resp. $G'_Z$) is the set of polynomials $g \in G'$ where $g_1(x,z)$ depends
nontrivially on $z$ (resp. $g_2(x,z)$ depends nontrivially on $x$). Fix $g \in G'_Z$. We can upper-bound (254) by


+ lI Pe) |? 4: E Uh -. tie Wiest (x2) (Pel MY (POINT,Z), a Oe Oe (257) 


< = lol HEL (Ely-g)) (Bel Myr“) (258) 
b' 


where the indicator $1_{b = g_1(x,z)}$ is removed using positivity in the inequality. To bound the second term
in (258), we use an argument similar to the one we used to derive (255). That is, say that a value of $z$ is
good if $g_2(\cdot, z)$ is not a constant function. The probability that $z$ is good is at least $1 - (2m+2)d/q$ by the
Schwartz-Zippel lemma and the assumption that $g \in G'_Z$ (and so $g_2$ depends nontrivially on $x$). Moreover,
for each good $z$ and for each $b'$ the expectation $\mathbb{E}_x\, 1_{b' = g_2(x,z)}$ is at most $(2m+2)d/q$. Hence, we conclude
that if &2(x, z) depends on x, then (258) is at most aaa Do) * and thus 
Se)aalP) yI? < O(b.p + 6g + md/q). (259) 


SEGz 
By starting with (253) instead (i.e., the order of $M^{(\mathrm{POINT},X),x}$ and $M^{(\mathrm{POINT},Z),z}$ are switched), we can
perform similar reasoning to deduce that


«)aal®) || < Olo + ôo + md/q). (260) 
SEG 


We thus obtain 


A ay (pe A A A A A ay j2 
gaa < D OSINY g)Aaa' glaa lý) | (261) 
gG gg’ gEGk gEG, 
< O(6.p + dQ + md/q). (262) 


Let $\delta_G$ be the error in the right hand side of Equation (262). Define a projective sub-measurement
$\{\hat{S}_{g_X,g_Z}\}$ with outcomes $g_X, g_Z \in \mathrm{ideg}_{d,m}(\mathbb{F}_q)$ by


A 


Sgx8z _ Scombineg,,¢," (263) 


A bound on the incompleteness of $ ex.gz follows from (262) and the projectivity of oy 


i (Bl (Sex.gz) aa lÊ) = ye bl ($ gax lÊ) = 


8X/8Z ggg 


Saa lÊ? < ôg. (264) 




5 ~-(POINT,W 
We now bound the consistency of {S¢,,¢, } with the points measurements {wt Ra) g 


1 = EPIs) Jaa lÊ) (265) 
ig Y (PIS) aal) (266) 
BEG 
P Z P X), 
~ (6g) 1/2 Eg ow Ol (Pl (Sgxigz)aal ® (D latszat x)+6gz(z 2 My es M ii ee al? 
X/8Z 
(267) 
~(69)1/2 a y G Sox on) wad ® ( > lap + Bb’ =agx (x)+Bez(z)=ab"+ Bb!” 
8X82 b,b',b" 
v a pyran X=) a lô) (268) 
Pring daa (ýS Sox gz) AA! 8 (E taso- agx(x)+Bgz(z)’ 
XZ 
Memes poe oe) aul) (269) 
(POINT,X),x 4%,(POINT,Z),Z y*(POINT,X),x A 
1/9 E 2 (HI ( (Sex gzAa Q (ma x(x) “My (2) M(x) ) sal): (270) 


Z gx 18Z 


Here Equation (266) follows from (264). Equation (267) follows by approximation (253) together with the
Cauchy-Schwarz inequality and the projectivity of $\hat{S}_g$:


~ (POINT,Z),z yy (POINT,X), A 
E L ($I (sxs) O- (D lars pnec jpe MY TOEO l) 
X,Z0B ox 97 BA 


= L (ÊS Tael al 


TA 


(POINT,Z),z EDX) ^ 
( L lab+pb'=agx (x) +Bgz(2) My M; Jax) |$) 


(Serel ae ® (I= 
b,b' 


E 
XZA P ox eg 


<1: ôo. 


Equation (268) follows by a similar calculation. To pass from here to Equation (269), we argue that for
$\alpha \neq 0$, the indicator forces $b = b''$, and the terms where $\alpha = 0$ are bounded in absolute value by $1/q$.
Finally, to reach Equation (270), we apply Equation (256) to the expectation of the indicator over $\alpha, \beta$.
Equation (270) tells us that


A ~ (P X) xX a%q_(P Z) Zz (P X) 
CET EN So E (“l OINT, Ml OINT, 2 Kl OINT, a ee (271) 


This is almost the desired conclusion of the lemma. The only issue is that $\{\hat{S}_{g_X,g_Z}\}$ is a sub-measurement.
To address this, we complete $\{\hat{S}_{g_X,g_Z}\}$ by adding $1 - \sum_{g_X,g_Z} \hat{S}_{g_X,g_Z}$ to an arbitrary measurement element. By
Equation (264), it follows that the consistency relation in Equation (271) holds for the completed mea- 
surement with an error increased by dg. Let dg be this new, increased, error, and observe that dg = 
O(ég + (ôg)! + d/q). Substituting in the bound on dg from Equation (262), we obtain that ôs = 




O(d.p + (d9)!/2 + md/q) has the form a! (md)" (et + q7™ + 2-"4) for some universal constants a’ > 
1,0 < b' < 1. We thus conclude by an application of Fact 5.24 that Equation (250) holds for W = X. 
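The completion step used in this argument can be sketched on a toy example. The code below completes a diagonal sub-measurement by adding the missing operator to one element, and shows that outcome probabilities shift by exactly the missing mass. The diagonal representation and all names (`complete`, `prob`, `sub`) are hypothetical choices made for illustration only.

```python
import math

def complete(sub, dim):
    """Add I - sum_k S_k to the first element of a diagonal sub-measurement.

    Each element is stored as its diagonal. Returns the completed measurement
    and the diagonal of the missing operator.
    """
    missing = tuple(1 - sum(S[i] for S in sub) for i in range(dim))
    first = tuple(s + m for s, m in zip(sub[0], missing))
    return [first] + list(sub[1:]), missing

def prob(diag, psi):
    """Probability <psi|D|psi> for a diagonal operator D."""
    return sum(d * abs(x) ** 2 for d, x in zip(diag, psi))

sub = [(1, 0, 0), (0, 1, 0)]      # projective, but sums to less than I
psi = (0.6, 0.0, 0.8)             # normalized example state
completed, missing = complete(sub, 3)
```

In this sketch the completed measurement differs from the sub-measurement only on the element that absorbed the missing operator, so any consistency estimate degrades by at most the incompleteness, as in the argument above.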

To obtain the same result for W = Z, we perform exactly the same steps starting from Equation (265) 
but with X and Z interchanged (so we use Equation (252) instead of Equation (253)). Finally, entirely 
analogous calculations with the registers swapped yield Equation (251). 

O 


A.6 Pulling the X and Z measurements apart 


Recall the decoding map of the low-degree code defined in Section 3.4. This map takes in a polynomial
$g \in \mathrm{ideg}_{d,m}(\mathbb{F}_q)$ and returns a string $\mathrm{Dec}(g) \in \mathbb{F}_q^M$ consisting of the evaluations of $g$ on the points in the
hypercube $\{0,1\}^m$. We now use this decoding map to construct a new set of measurement operators out of
$\hat{S}$. First, we introduce some notation. For all $W \in \{X,Z\}$ and all degree $d$ polynomials $g : \mathbb{F}_q^m \to \mathbb{F}_q$
define projectors


where the measurement $\{\hat{S}_{g_X,g_Z}\}$ is given by Lemma A.21.
For all W € {X, Zy, ü E FM, anda € F,, define the projector 


oles: yaa (272) 
heEFM :h-ti=a 


where recall that g” is the projector corresponding to measuring (C7)®™ in the W basis and obtaining 
he Fy as the outcome. For all ñ € FY define the projective measurement {MY "Sack, by 


Ma” =} 8y 8 Thectg)-a)—a(A) - (273) 
8 


If a W is viewed as an operator acting on registers AA’ (resp. BB’) then we view M; Will as an operator acting 
on ENES AA‘A" (resp. BB t} r (When necessary we will indicate Nee subscripts which registers the 
operators are acting on.) The MY” are projective because the ov and t operators are projective. 

For all j € {1,...,t} define the observable 


WI (i) Z y(-1 154 ) MWA, ü (274) 


a 


where $\{e_j\}_{j \in \{1,\dots,t\}}$ denotes the self-dual basis for $\mathbb{F}_q$ over $\mathbb{F}_2$ specified in Section 3.3.1. Let $\tau^W$ be the
generalized Pauli observable defined in Section 3.7. Observe that


( D (D D E) BT = y Pra E gs @ aN 
8X /8Z 8x8z h 
= ELEY gn @ (af) 
Sw a h: h-ŭ=a 
= =} (- je (e;(ñ-Dec(gw) tope Q T” (i) 
gw 4 
= WI (i), 




where the first equality uses the definition of t™ (e;ñ ej ii), the second uses the definition of u and regroups 


terms, the third uses the definition (272) of tW (i), and the last is by a change of variables. These equations 
in particular show that the observable Wi (ii ji is Hermitian and squares to the identity, because the measure- 
ment {$ ex,gz + ÍS projective. Furthermore, it can be verified by direct calculation that the observables Wi (i) 
satisfy the same relations as the generalized Pauli group. Specifically, for all j,j’ € {1,...,t}, 7,6 € FM, 


EPER Z! (6) XI (ü ES 
T a 

(—1) 
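As a sanity check on relations of this type, the following Python sketch verifies the twisted commutation relation in the simplest setting: ordinary qubit Paulis, i.e. $q = 2$, on two qubits. This is an assumption made for illustration (the paper works with generalized Paulis over $\mathbb{F}_q$), and the helper names are hypothetical.

```python
from itertools import product

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def pauli(W, bits):
    """Tensor product of W^{bits[i]} over all positions i."""
    out = [[1]]
    for b in bits:
        out = kron(out, W if b else I2)
    return out

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def check_relations(n=2):
    """Check Z(v) X(u) = (-1)^{u.v} X(u) Z(v) and X(u)^2 = I for all u, v."""
    for u in product((0, 1), repeat=n):
        for v in product((0, 1), repeat=n):
            sign = -1 if sum(a * b for a, b in zip(u, v)) % 2 else 1
            zx = matmul(pauli(Z, v), pauli(X, u))
            xz = matmul(pauli(X, u), pauli(Z, v))
            if zx != [[sign * e for e in row] for row in xz]:
                return False
        if matmul(pauli(X, u), pauli(X, u)) != identity(2 ** n):
            return False
    return True
```

The same check extends to more qubits by changing `n`, at the cost of exponentially larger matrices.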
The next lemma shows that the measurements {my at and observables WI (ñ) are consistent with the 
original points measurements, and are self-consistent. 
Lemma A.22. Let ôs as in Lemma A.21. For all W € {X,Z}, all i € F} andj € {1,...,t}, the 
measurements {MY "ck, and observables Wi (ii) satisfy the following properties: 


1. (Consistency with points measurements) For all W € {X, Z}, forall j € {1,...,t}, and on average 
over uniformly random u € FP 


W, ind, POINT,W), 
lixa 8 Ma ne) ~6, M! m ® Igg’'p” 


and 


~ W,ind, POINT,W), 
M! in m (U) Q Ipp'p” Xss Ty atat Q Mí )u 


2. (Self-consistency) On average over uniformly random ti € EY, 
Wi (i) Q Ipg'p” 55 Isaa" Q Wi (a) x 


Proof. We show the first item of the lemma for the case when Fe ceca acts on register A and M; 
acts on registers BB/B” (the argument for when the A and B registers are interchanged proceeds analo- 


gously): 


W, ind, (u) 


E TEGIME m D MW delu) By 


ucky a 


,W),u a i r 
= EB Pe e i r ® a idm (t)) pr) 
AIAN ,W),u 
= E Po a uas" ® (EPa) 
8 
W), 
= Ey LAV (ÑM Mo AB” 8 S4 Jeg lÊ) 


>1-—ôs. 


The second line follows from expanding the definition of MY indm(u) using (273), the third line follows 


from the definition of Ma PO given in Section A.3, the fourth line follows from the fact that Dec(g) - 
indm(u) = g(u), and the fifth line follows from Lemma A.21. This implies the first item of the Lemma. 




To show the second item we first argue that, on average over i, 


( wi) Reg (aw) oe (275) 
AA‘A BB'B 


This follows because on average over u € Fy and ñ € Fy we have 


(Mn) rar (276) 
as (MA) re (ESA K O (Mioa ear) (277) 
= 3 (Paw ® (t{ ee(g).a)-a(i))4”) . ( L (MPO) Q (P Gndm(u)))ar) (278) 

ren 
= (SP aa ® (TH) an @ (Moree (279) 


The first approximation follows from right-multiplying (276) by the approximation (297) in Lemma A.23 (to be
proved below) and using Fact 5.19. Line (278) follows from expanding the definition (273). To obtain
Line (279), we expanded the definitions of Teel) a-a i) and T (indm(u)) using Eq. (272), and the fact 


that h- ind,,(u) = gn (u) (due to Eq. (13)). Thus for all g € ideg; „(Fq) and s € F} we have 


eigi A) (indm(u)) = 2 W., 
(Dec(g)+h)-ñ=a 
gn(u)=s 


ONES 


Using the self-consistency of { M} shown in Lemma A.7, we then obtain 


ame D GPa (MEIC) (MET), 
(Dec(g)—h)-ii=a 
~o EPan (Me yay JA 8 a @ (ar) (Meca Ja 8D 
T as 
= - (SP yaar (aa 8 ar) ; (Meee De . (282) 
g, i 


The “so” in Line (281) indicates that the left hand side is equal to the right hand side when applied to the 
state iB). Line (281) follows from the self-consistency of the operators {z }, and Line (282) follows from 


the definition of aaa al Using the fact that |) = Ep (1 )g' D (Th) pv |b), we get that 


(282) Xo D r . M a 8 War) i ani ® 3 © Ty w) 


gh: s BB’B” 
(Dec(g)—h) -fi=a 
(283) 
ê ~ (P ,W), a (P W), 
= 2 (C4 i M aT ® ar) f (O n ® (W Ja) . (284) 




Line (284) follows from (283) by expanding the definition of TO. cle 


average over u € F”, 


(284), E (San a) @ (MEN ty) pa ® (Ty Jar) 285) 
ghh: 


(Dec(g)—h)-fi=a 


9 Next, we argue that, on 


We show this by bounding the magnitude of the difference. Using the fact that oY and cals are projective, 
we have: 


ê z ,W),u 4 ,W),u ay ||2 
Saar @ a @ (MS re Nay oar @ (tH dar || 


i ~(P WW), A a (P W), 
=E D (CE Me SE U- M Dan (ar) 
8 
A ,W),u 2 W A 
. (caro 24, @ (oY) ) i) 
g—gn +g )(u)/ BB h! /B 
i ^ (P YW), A ~a (P W), A 
SELPI- My) SL M aal) (286) 
& 

a (287) 
~ (POINT,W) u \9 
M ) Q 
(8—8nr+8p ) (u) BB’ 
(a )g~ is at most I. The inequality in Line (287) follows from the second item of Lemma A.23 (to be 

proved below). Next, we show that on average over u € Fg 


The inequality in Line (286) follows from the fact that for all g, h, the sum Yy ( 


â A F VW) ,u 
285)= Eo (Eae Caan) (SM mare aay (TH Daw} (288) 
gh gh: 
(Dec(g)—h)-fi=a 
a A ^a ({P VW) ,u 
~0(6s+Vé) L (a 8 (T)ar) i E i Me Jer 8 (W Je) 
g,h,g' W: 
(s'—8n)(u)=(8—8n) (4) 
(Dec(g)—h)-fi=a 
(289) 
~is (EPan @ (ar) (SE) np 8 (er) (290) 
g,h,g' hW: 
(s'—8n)(u)=(8—81) (4) 
(Dec(g)—h)-fi=a 
Smaa E (Paw o ar) (Epee @ (HBr) - (291) 
gh,g',h' 
8'—81 =8—8n 


(Dec(g)—h)-ii=a 
Before we justify the sequence of steps to go from Line (285) to Line (291), we first argue how this establishes
Line (275), and hence the second item of the Lemma statement.
Absorbing the error terms $O(md/q)$ and $O(\sqrt{\varepsilon})$ into $\delta_S$, we have shown that


(MN) Xs ty (EPa ® ar) ' (6 )ee 8 (i! Jnr) 
8'— 8y =8— 8h 
(Dec(g)—h)-ti=a 




Note that the conditions (Dec(g) — h) - ñ = a and g — g) = g’ — gy are equivalent to the conditions 

(Dec(9’) — h') - ñ =a and g — gy = 8 — gy. Thus the expression in (291) is symmetric between g, h and 

g',h', and so an analogous derivation shows that (m ”) is ds-close to (291), and thus ôs-close to 
B 


B'B” 
(me =) This shows (275). Since {M!"""} is projective, by using Item 2 of Fact 5.18, followed by 
AA‘A n` 
Fact 5.24, and then followed by Item 1 of Fact 5.18, we get for all j € {1,...,t}, on average over fi € Fy 


(Mine, fs = pj) Aana” 5s (Mic) 0] BBB" 


where the answer summation is over b € Fp. Finally, the second item of the Lemma follows from apply- 
ing Lemma A.5. 

We now return to justifying the sequence of steps between Lines (285) and (291). The equality in 
Line (288) follows by left-multiplying (285) by Igg' = Lig (St als 


The approximation in Line (289). The approximation in Line (289) follows by bounding the magnitude 
of the difference: 


GW W AW 7%(POINT,W),u 
E) | 2 ((Sf Jaa Q (Ty Jar) i Ņ Magte aj BB! 8 (ty) B”) Al 
a g, Ta l, 
(3’— Sy) (4) A(8—8n) (u) 
(Dec(g)—h)-ii=a 
_ “1 (AW Ww i ~ (POINT,W),u êW, 
= ty (s; Jaa’ ® (Th Jar) CCNA Se 
(8’—8w)(u) A(8—8n) (u) 


A P „W „u - 
U ee pp’ & Garg) |$) 


<E E PSE an @ (ap Dar) © (AGON. SF MGO a (T)ar LB) 


ghg ha 
a'#g'(u 
— F O hia a ; sw. V Jep lĝ) (292) 
ga:azg(u) 
“ ~ (P ,W),u 
Soa EEI — My) apr ® (Sf) wl) (293) 
& 
< ds. (294) 


The approximation in Line (293) follows from the following calculation: 


|293) z (292)| 


< |F E To (GME : Spe ; Ce ma Weer z C Ae nl) (295) 
ga: a 
A a VW) ,u ^ WW), A ^ WW) ,u “a 
+E E OAE a (METO an): (Eaa (ME) arl) 
" ga:a#g(u) 
(296) 




Using Cauchy-Schwarz, we can bound the first term on the right hand side of the inequality by 


1/2 
(; D (pn ; ew. M AN wi ; 


" o a:a#g(u) 


(x L (PAPON ag — (MOTO n) (Se) pa 
gaa#g(u) 


1/2 
(MPOT agr = (MP oy na) 


<V1-Ve 


The inequality follows from the self-consistency of $\{M^{(\mathrm{POINT},W),u}_a\}$, as established in Lemma A.9. Similarly,
we can bound the second term in Line (296) by $\sqrt{\varepsilon}$. This establishes the approximation in Line (293).

The inequality in Line (294) follows from Equation (251) of Lemma A.21. This concludes the derivation
of the approximation in Line (289).


The approximation in Line (290). We establish this approximation by bounding the magnitude of the 
difference: 


a PoInt,W 2 
FX 3 (Ean 8 (ar) (CP Mg Daw @ (tH Yn) | 
g g,h,g',h': 
(g' Su) (u)=(8—8n) (u) 
(Dec(g)—h)-ii=a 
IÊ ~ (P ,W),u A 
5E D EPan @ Han U- Mpo) SE 
ghg h 
(g 8p) (u) 
=(g—gn)(u 


O- MON) @ (Th) rl) 


< EEPO- MGa) SY T= MA) lB) 


<ds . 
The last inequality follows from the second item of Lemma A.23 (to be proved below). 


The approximation in Line (291). We establish this approximation by bounding the magnitude of the 
difference: 


(g'—8yr) (4) =(8—8n) (u) 
(Dec(g)—h) -ti=a 


= E CEM aa o Gan) (EPer 8 (af dar) 9) B1 — gw) U) = (8 - s) u) 
g,h,g',h': 
8'— 8n F8- 8h 


< md/q. 




The last inequality follows from the Schwartz-Zippel lemma: two distinct polynomials of total degree at
most $md$ cannot agree on more than an $md/q$ fraction of points.
□
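The Schwartz-Zippel bound invoked here can be checked numerically on a small instance. The sketch below enumerates all points of $\mathbb{F}_7^2$ and compares the agreement fraction of two distinct polynomials with (total degree)/(field size). The prime field $\mathbb{F}_7$, the two example polynomials, and the function names are assumptions chosen for illustration; the paper works over $\mathbb{F}_q$ with $q$ a power of 2.

```python
from itertools import product

def eval_poly(coeffs, point, p):
    """Evaluate a polynomial {exponent tuple: coefficient} at `point` over F_p."""
    total = 0
    for exps, c in coeffs.items():
        term = c
        for x, e in zip(point, exps):
            term = term * pow(x, e, p) % p
        total = (total + term) % p
    return total

def agreement_fraction(f, g, m, p):
    """Fraction of points u in F_p^m with f(u) == g(u)."""
    agree = sum(eval_poly(f, u, p) == eval_poly(g, u, p)
                for u in product(range(p), repeat=m))
    return agree / p ** m

def total_degree(coeffs):
    return max(sum(exps) for exps, c in coeffs.items() if c != 0)

# Hypothetical example: two distinct polynomials in m = 2 variables over F_7.
P, M = 7, 2
f = {(2, 1): 1, (0, 1): 3, (0, 0): 5}   # x^2*y + 3y + 5
g = {(1, 1): 2, (1, 0): 4, (0, 0): 5}   # 2xy + 4x + 5
deg = max(total_degree(f), total_degree(g))
frac = agreement_fraction(f, g, M, P)
```

Exhaustive enumeration is only feasible for tiny parameters; it is meant to make the statement concrete, not to replace the lemma.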


We now give a proof of the following Lemma, which was used in several derivations in Lemma A.22.
Lemma A.23. For $W \in \{X, Z\}$, on average over $u \in \mathbb{F}_q^m$, we have that
ECs (mn Maar (297) 
& 


and 


r= ant 298) 


where the answer summation is over $g \in \mathrm{ideg}_{d,m}(\mathbb{F}_q)$. Furthermore, all symmetric equivalents of these
approximations also hold.


Proof. We first prove Equation (297). From Equation (250) of Lemma A.21 we have that on average over
$u \in \mathbb{F}_q^m$,


MEG awel pari) 2 1- ôs (299) 
which is equivalent to eo, aw Q patel %5; I. By Fact 5.18, we have that 


A a (P ,W), 
DEY aw 8 (ME o Xss I; 


We now prove Equation (298). Equation (299) and Fact 5.18 show that


POINT,W),u 


(SFr N= al) AA! “ds (ms )Bar 


where the answer summation is over a € F}. Left-multiplying this expression by (Eua) Aa’ and using 
Fact 5.19 we get 


(Suja AA! Xss (Siu )=a)) aa’ ® aves 


~; T a Oe u PR 


POINT,W),u ) 


BA” 


where the second approximation follows from the following calculation: 


CF pce (LO a) (300) 

~ OINT, OINT, 2 A 
eprint oY a yy (MN) an) TB) G01) 
=EL K o. Pornt), a _ C aa P ani) f (302) 
<e (303) 




The first inequality follows from the fact that S ile ia Š < I for all a, and the second inequality follows from 


the self-consistency of Ms (POINTI); “ (established in Lemma A.9). Thus 
A A ~ (POINT,W), 
(Sigi=a)aar Soas) Spasa MO aw 
On the other hand, notice that 
GW (PoINT,W),u 
EL (sa — momma), ugy 
a PoINT,W), A P w), 
=E L(G- MET). SEL AGT Dal) 
& 
A ~ (POINT,W), A ~ (POINT,W), A 
=E Dhl a My oS "9 l Ŝi u=] ` (Q E My Ve yield) 
a 
<O(ôs + £) 
Equation (298) is thus obtained by absorbing $\varepsilon$ into the function $\delta_S$. □


The next lemma shows that we can construct local unitaries V4, Vg from the measurements Ê such that 
the shared state |p) is close to |AUX) ® |EPR,)°™ for some state |AUX), and the observables W/ (ñ) are 


mapped to Pauli observables tT” (e;i) acting on |[EPR;)®™. Using the consistency between the {MY a sack A 


(POINT,W),u 


measurements and the points measurements {M, Wa A established in Lemma A.22, and the consis- 


PAULI,W 


tency between the points measurements and the “total Pauli measurements” {Mí H eFM established 


in Lemma A.7, we deduce that the total Pauli measurements must be close (up to the local unitaries V4, Vg) 
. . W . M 
to the Pauli projectors {7,,"} nep™ acting on |EPR,)® 


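Lemma A.24. There exists a function $\delta_{QLD}(\varepsilon,m,d,q) = O(\delta_S^{1/4} + md/q)$ (where $\delta_S$ is defined in Lemma A.21)
and unitaries $V_A$ acting on registers $AA'A''$ and $V_B$ acting on $BB'B''$ such that


1. There exists a state $|\mathrm{AUX}\rangle$ on registers $AA'BB'$ such that

$$\Big\| V_A \otimes V_B |\hat\psi\rangle_{AA'A''BB'B''} - |\mathrm{AUX}\rangle_{AA'BB'} \otimes |\mathrm{EPR}_q\rangle^{\otimes M}_{A''B''} \Big\|^2 \le \delta_{QLD} .$$

2. For all $W \in \{X,Z\}$, $h \in \mathbb{F}_q^M$,

$$V_A\, M^{\mathrm{PAULI},W}_h\, V_A^\dagger \approx_{\delta_{QLD}} I_{AA'} \otimes (\tau^W_h)_{A''} ,$$

$$V_B\, M^{\mathrm{PAULI},W}_h\, V_B^\dagger \approx_{\delta_{QLD}} I_{BB'} \otimes (\tau^W_h)_{B''} ,$$

where the $\approx$ statements hold with respect to the state $|\mathrm{AUX}\rangle_{AA'BB'} \otimes |\mathrm{EPR}_q\rangle^{\otimes M}_{A''B''}$.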


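Proof. Define the linear map

$$V_A = \sum_{g_X, g_Z} (\hat{S}_{g_X,g_Z})_{AA'} \otimes \big(\tau^X(\mathrm{Dec}(g_Z))\, \tau^Z(\mathrm{Dec}(g_X))\big)_{A''} ,$$

where the $\tau^X(\cdot), \tau^Z(\cdot)$ act on $(\mathbb{C}^q)^{\otimes M}$. (Also note that the argument of $\tau^X$ is $\mathrm{Dec}(g_Z)$ and the argument of
$\tau^Z$ is $\mathrm{Dec}(g_X)$.) We verify the unitarity of $V_A$:

$$V_A V_A^\dagger = \sum_{g_X, g_Z} (\hat{S}_{g_X,g_Z})_{AA'} \otimes \big(\tau^X(\mathrm{Dec}(g_Z))\, \tau^Z(\mathrm{Dec}(g_X))\, \tau^Z(\mathrm{Dec}(g_X))\, \tau^X(\mathrm{Dec}(g_Z))\big)_{A''}$$

$$= \sum_{g_X, g_Z} (\hat{S}_{g_X,g_Z})_{AA'} \otimes I_{A''}$$

where in the first line we used that the measurement $\{\hat{S}_{g_X,g_Z}\}$ is projective and in the second line we used
that the $\tau^Z$ and $\tau^X$ observables are self-inverse. The unitary $V_B$ is defined analogously.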
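The structure of this unitarity check can be reproduced on a toy example: a controlled operator of the form $V = \sum_g S_g \otimes U_g$, with $\{S_g\}$ a projective measurement and each $U_g$ self-inverse, is unitary. The sketch below uses a single control qubit with $U_0 = X$ and $U_1 = Z$; these choices and all helper names are hypothetical and serve only to illustrate the calculation.

```python
S0 = [[1, 0], [0, 0]]   # projector onto |0>
S1 = [[0, 0], [0, 1]]   # projector onto |1>
X = [[0, 1], [1, 0]]    # self-inverse observable
Z = [[1, 0], [0, -1]]   # self-inverse observable

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# V = S0 (x) X + S1 (x) Z; all entries are real, so the adjoint is the transpose.
V = add(kron(S0, X), kron(S1, Z))
```

Because the $S_g$ are orthogonal projectors, the cross terms in $V V^\dagger$ vanish and each remaining term contributes $S_g \otimes U_g U_g^\dagger = S_g \otimes I$.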

Let $\{e_j\}_{j \in \{1,\dots,t\}}$ denote the self-dual basis for $\mathbb{F}_q$ over $\mathbb{F}_2$ specified in Section 3.3.1, and let $W^j$ be
the observables introduced in (274). Conjugating the observable $W^j$ acting on the registers $AA'A''$ by the
unitary $V_A$ we get


Va WI (it) Vi 


= L (—1)"(eDPeclsw)-4)) a 


@ (t*(Dec(gz))t”(Dec(gx))t” (eji) 
8X/8Z 


TŽ (Dec(gx))*t*(Dec(gz))*) ., 


where in the third line we used the (anti-)commutation relation 

T* (Dec(gz))t*(Dec(gx))t™ (ej) = (—1) i(Peelaw)) tW (ett) r* (Dec(gz))t” (Dec(gx)) - 
An entirely analogous calculation shows that Vg WI (ñ) Vit = Igy & = (ejŭ )) 

a B” 

Next, from the self-consistency property of the observables W’ specified in Lemma A.22 we get that for 

We {X,Z}, 
ne es et ee E ` 
E E ($I(Wagr-Iew(a)) (Wa)gI-18W(i)) |p) < ôs, 
je {1,...,t} eFM 

which, since W! is Hermitian, is equivalent to 


E E (IW (a) @W(a@)|p) > 1-65 /2. (304) 
jE {1,..t} ack} 


Let |0) = Va Q Val), and for W € {X, Z} define the operator 


Hw= E E (ef) @t (ea 
je{1,...t} EFM (6 ) (6 


@M 
= E ( E T" (ejs) 8 t" (ejs) ) 
jE{1,..t} `sEF; 


= ( E mege)” 


sek, 




where in the second line we used that for all ñ € F7, TW (0) = @M, t" (a;) with 7” (a;) being the 
generalized Pauli W observable acting on the i-th qudit. In the third line, we used the fact that for every 
fixed j, we have Eser, T™ (ejs) @ tT™ (ejs) = Eser, T™ (s) @ tT™ (s). Equation (304) is then equivalent to 


(0|Hw|0) > 1- ôs/2, 


for W € {X, Z}. 
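Observe that

$$\mathop{\mathbb{E}}_{s,s'} \tau^X(s)\tau^Z(s') \otimes \tau^X(s)\tau^Z(s') = \mathop{\mathbb{E}}_{s,s'} \sum_{a,b \in \mathbb{F}_q} (-1)^{\mathrm{tr}(s'a)} |a+s\rangle\langle a| \otimes (-1)^{\mathrm{tr}(s'b)} |b+s\rangle\langle b|$$

$$= \mathop{\mathbb{E}}_{s} \sum_{a \in \mathbb{F}_q} |a+s\rangle\langle a| \otimes |a+s\rangle\langle a|$$

$$= |\mathrm{EPR}_q\rangle\langle \mathrm{EPR}_q|$$

where to go from the first to the second line we used Lemma 3.25. This implies that $H_X H_Z = \big(\mathbb{E}_{s,s' \in \mathbb{F}_q} \tau^X(s)\tau^Z(s') \otimes
\tau^X(s)\tau^Z(s')\big)^{\otimes M} = |\mathrm{EPR}_q\rangle\langle \mathrm{EPR}_q|^{\otimes M}$. By the triangle inequality, we have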
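$$\big\| H_X|\theta\rangle - H_Z|\theta\rangle \big\| \le \big\| |\theta\rangle - H_X|\theta\rangle \big\| + \big\| |\theta\rangle - H_Z|\theta\rangle \big\| .$$

Note that for $W \in \{X,Z\}$,

$$\big\| |\theta\rangle - H_W|\theta\rangle \big\|^2 = 1 - 2\langle\theta|H_W|\theta\rangle + \langle\theta|H_W^2|\theta\rangle \le 2 - 2\langle\theta|H_W|\theta\rangle \le \delta_S ,$$

where in the inequality we used that $H_W \le I$. Furthermore,

$$\langle\theta|H_W^2|\theta\rangle = \big\| H_W|\theta\rangle \big\|^2 = \big\| |\theta\rangle + (H_W - I)|\theta\rangle \big\|^2 \ge \big( \| |\theta\rangle \| - \| (H_W - I)|\theta\rangle \| \big)^2 \ge (1 - \sqrt{\delta_S})^2 \ge 1 - 2\sqrt{\delta_S} .$$

Therefore

$$4\delta_S \ge \big\| H_X|\theta\rangle - H_Z|\theta\rangle \big\|^2 = \langle\theta|H_X^2|\theta\rangle + \langle\theta|H_Z^2|\theta\rangle - 2\langle\theta|H_X H_Z|\theta\rangle \ge 2 - 4\sqrt{\delta_S} - 2\langle\theta|H_X H_Z|\theta\rangle ,$$

which implies that

$$\langle\theta| \big( I_{AA'BB'} \otimes (|\mathrm{EPR}_q\rangle\langle\mathrm{EPR}_q|^{\otimes M})_{A''B''} \big) |\theta\rangle \ge 1 - 2\sqrt{\delta_S} - 2\delta_S \ge 1 - O(\sqrt{\delta_S}) .$$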
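The identity expressing the EPR projector as an average of Pauli products, which underlies the bound above, can be verified directly in the qubit case $q = 2$. The sketch below averages $\tau^X(s)\tau^Z(s') \otimes \tau^X(s)\tau^Z(s')$ over $s, s' \in \{0,1\}$ using ordinary Pauli matrices. Restricting to $q = 2$ is an assumption made for illustration, and the helper names are hypothetical.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def averaged_pauli_products():
    """Average of (X^s Z^s') (x) (X^s Z^s') over s, s' in {0, 1}."""
    total = [[0] * 4 for _ in range(4)]
    for s in (0, 1):
        for s_prime in (0, 1):
            P = matmul(X if s else I2, Z if s_prime else I2)
            PP = kron(P, P)
            total = [[t + p for t, p in zip(rt, rp)] for rt, rp in zip(total, PP)]
    return [[0.25 * t for t in row] for row in total]

# Projector onto (|00> + |11>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
EPR_PROJECTOR = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]
```

For $M$ pairs the same identity holds factor by factor, which is how the $M$-fold tensor power arises.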
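Define the unnormalized state $|\mathrm{AUX}_0\rangle$ on the registers $AA'BB'$ as

$$|\mathrm{AUX}_0\rangle_{AA'BB'} = (\langle \mathrm{EPR}_q|^{\otimes M})_{A''B''} \cdot |\theta\rangle_{AA'A''BB'B''}$$

and define the normalized state $|\mathrm{AUX}\rangle = \frac{1}{\| |\mathrm{AUX}_0\rangle \|} |\mathrm{AUX}_0\rangle$. The inner product between $|\theta\rangle$ and $|\mathrm{AUX}\rangle \otimes
|\mathrm{EPR}_q\rangle^{\otimes M}$ can thus be evaluated as

$$\langle\theta| \big( |\mathrm{AUX}\rangle \otimes |\mathrm{EPR}_q\rangle^{\otimes M} \big) = \frac{1}{\| |\mathrm{AUX}_0\rangle \|} \langle\theta| \big( |\mathrm{AUX}_0\rangle \otimes |\mathrm{EPR}_q\rangle^{\otimes M} \big)$$

$$= \frac{1}{\| |\mathrm{AUX}_0\rangle \|} \langle\theta| \big( I_{AA'BB'} \otimes (|\mathrm{EPR}_q\rangle\langle \mathrm{EPR}_q|^{\otimes M})_{A''B''} \big) |\theta\rangle$$

$$= \frac{1}{\| |\mathrm{AUX}_0\rangle \|} \big\| |\mathrm{AUX}_0\rangle \big\|^2$$

$$= \big\| |\mathrm{AUX}_0\rangle \big\|$$

$$\ge \big\| |\mathrm{AUX}_0\rangle \big\|^2 \ge 1 - O(\sqrt{\delta_S}) .$$

This implies $\big\| |\theta\rangle - |\mathrm{AUX}\rangle \otimes |\mathrm{EPR}_q\rangle^{\otimes M} \big\|^2 \le O(\sqrt{\delta_S})$, which establishes the first item of the lemma.

We now establish the second item of the lemma. In what follows, all $\simeq$ and $\approx$ statements hold with
respect to the state $|\hat\psi\rangle$. From Lemma A.7 we have the following implications: for all $W \in \{X, Z\}$ and on
average over $u \in \mathbb{F}_q^m$,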


(POINT,W),u 


PAULI,W 
Mp aa] © Igg'g” ~e laaa" B Ma , (305) 
A i a Aa M(POINTW) 1 E lnii (306) 
where M nt w] = Lng, (u T From Lemma A.22 we get that 
me © Ibps” Sss laa'a" Q M (307) 
Using Fact 5.23 with Equations (305), (306), and (307), we get 
PAULLW W,indm 
Mya) ® lag's” Syg lanar @ Mann) , (308) 


where we used that $\delta_S \ge \varepsilon$.
Next, we have for all $u \in \mathbb{F}_q^m$


= L (Sex22) 55 2 A 


) 
) 
=) (ea — 


1 
8x8z BB ingn(u)=gw(u)—a 
1 
8x82 BB’ ingn(u)=gw(u)—a 
ser ê W 
= Sgxgz BB’ 8 L ( h Ji 
8X8Z h:g),(u)=a 


=Ipp' @ (T, (uw) =a) B" , (309) 




where in the second line we expanded out the definitions of Vg and Vi w. indu (u) (defined in Eq. (273)); in 
the third line we expanded out the definition of T ti) „Gndm(u)) (defined in Eq. (272)) and used the fact 


u)— 
that for all h € EM, the inner product h - ind,,(u) is equal to g} (u); in the fourth line we used the identities 


T*(8)t T” (8) = Tye and 77s) T (s)" = Ty; 


and the fifth line is due to the following short calculation: for all fixed $g_W$ we have

$$\sum_{h :\, g_h(u) = g_W(u) - a} \tau^W_{h + \mathrm{Dec}(g_W)} = \sum_{h :\, g_W(u) - g_h(u) = a} \tau^W_{h + \mathrm{Dec}(g_W)} = \sum_{h :\, g_W(u) + g_h(u) = a} \tau^W_{h + \mathrm{Dec}(g_W)} = \sum_{h' :\, g_{h'}(u) = a} \tau^W_{h'} .$$

The second equality follows from the fact that $\mathbb{F}_q$ is a field of characteristic 2 (so subtraction is the same as addition),
and the third equality follows from the fact that letting $h' = h + \mathrm{Dec}(g_W)$, we have $g_{h'} = g_h + g_W$ (in
other words, the low-degree encoding $h \mapsto g_h$ is linear in $h$).
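Both facts used in this calculation can be illustrated in the smallest nontrivial field of characteristic 2. The sketch below implements arithmetic in $\mathbb{F}_4$ (elements stored as 2-bit integers, reduced modulo $x^2 + x + 1$) and an encoding of the form $g_h(u) = h \cdot \mathrm{ind}(u)$. The specific multilinear form chosen for the indicator weights is an assumption made for illustration and may differ from the paper's definition of $\mathrm{ind}_m$; linearity in $h$ holds for any fixed weight vector. All function names are hypothetical.

```python
from itertools import product

def gf4_add(a, b):
    """Addition in F_4; in characteristic 2 this is also subtraction."""
    return a ^ b

def gf4_mul(a, b):
    """Multiplication in F_4 = F_2[x] / (x^2 + x + 1)."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:          # reduce the x^2 term
        r ^= 0b111
    return r

def indicator_weights(u):
    """Multilinear indicator weights over {0,1}^m (assumed form of ind_m(u))."""
    weights = []
    for x in product((0, 1), repeat=len(u)):
        w = 1
        for ui, xi in zip(u, x):
            w = gf4_mul(w, ui if xi else gf4_add(1, ui))
        weights.append(w)
    return weights

def encode(h, u):
    """Evaluate g_h(u) = sum_x h_x * ind(u)_x over F_4."""
    acc = 0
    for hx, w in zip(h, indicator_weights(u)):
        acc = gf4_add(acc, gf4_mul(hx, w))
    return acc
```

With these definitions, evaluating at a hypercube point returns the corresponding coordinate of $h$, and $g_{h + h'} = g_h + g_{h'}$ pointwise.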


Let |A) = |AUX) ® |EPR,)®™. We show that Va (mi Vi Soap lax © (T)ar (the 
A 


argument for the operators acting on registers BB’B” proceeds identically). 


aa 


+ 
Dial (Va My VA — (ar) (Va My" VE — Caf) xr) 1A) 
h 


= 2-28 F(A Va ME Vt HY) anf A) 
h 
= 2-2) (A|Va MPW VE @ (TY) gn lA) (310) 
h 


where in the last line we used that $(\tau^W_h)_{A''} |\mathrm{EPR}_q\rangle^{\otimes M} = (\tau^W_h)_{B''} |\mathrm{EPR}_q\rangle^{\otimes M}$. We now claim that


L(A ME VE @ (af yn) 

h 

Sma E E (A Va MVE @ (2p )prlA) GU) 
hh: 


gn (U) =p (u) 


P. ,W 
= 3 (A| Va Mitea) VA 8 (T (u)=a))B”lA) 
is ack, 
= EY (AlVa Migja] VA © Vs Ma' pen) (312) 
ack, 


where the equality in Equation (312) follows from the identity in Equation (309), and the approximation in 
Equation (311) follows from the following calculation: 


EY (AV MYW v} @ (af) pA) 
uhh’: 
Sn (Ut) =Spr (u) 


=e (Elgit) z g(a) (Al Va MEM”) VE & (tW) rl) 
hen! 


u 




In the last line we used that for distinct $h \neq h'$, the polynomials $g_h \neq g_{h'}$ and thus by the Schwartz-Zippel
lemma can only agree on at most an $md/q$ fraction of points $u \in \mathbb{F}_q^m$.


Continuing from Equation (312), we use the first item of the Lemma which shows that |||A) — |@) ||? < 
O(,/6s) and equivalently |V} @ VEJA) — |p) |]? < O(/%s), so 


My ~ Wind, (u iĝ) 


Eq. (312) So) E pai TOn, 


> 1- O(65/ 2) 
where the last line comes from (308). Together with Equation (310), 
Eq. (310) < O(65/4 + md/q) . 
This concludes the proof of the second item of the lemma. O 


We now prove Theorem A.1.


Proof of Theorem A.1. Let Z = (p, M) be a projective strategy for O rams that succeeds with prob- 
ability at least 1 — e, where |) is a bipartite state on registers AB. As mentioned at the beginning of 
Appendix A.2, we assume that the state |Y} is padded with sufficiently many ancilla |0} qubits. Let pa be 
the following isometry mapping the register A to the registers AA’A” where A’ and A” are isomorphic to 
(C1)8M . 

Pa : |0)a +> Va(|6)a @ (EPR) 8") arar) 


where V; is the unitary acting on AA’A” from Lemma A.24. Define the isometry pg analogously. Lemma A.24 
then implies that there exists a state |AUX) on registers AA’BB’ such that 


2 
|e ® ely) — laux) @ |EPR,)°™|| < darn 
and for W € {X,Z}, 


P ,W P ,W 
pa (MET gt Sa ar ad pa (ME T ph Si a) 


where the ~ statement holds with respect to the state |AUX) @ |EPR,)°™. This shows that Fe ams is 


a self-test for the strategy 4"! with robustness dgip(¢,m,d,q). We conclude the proof by making the 
identification of registers AA’ with A’ and BB’ with B’ to be consistent with the statement of Theorem A.1. 
We also note that, when unpacking the dependence on parameters $\varepsilon, m, d, q$, the function $\delta_{QLD}(\varepsilon,m,d,q) =
O(\delta_S^{1/4} + md/q)$ has the form $a(md)^a(\varepsilon^b + q^{-b} + 2^{-bmd})$ for some universal constants $a > 1$, $0 < b < 1$,
as desired. □




References 


[Ara02] PK Aravind. A simple demonstration of Bell’s theorem involving two observers and no
probabilities or inequalities. arXiv preprint quant-ph/0206070, 2002. 7.2, 7.2

[AS98] Sanjeev Arora and Shmuel Safra. Probabilistic checking of proofs: A new characterization of
NP. Journal of the ACM (JACM), 45(1):70-122, 1998. 16

[BFL91] László Babai, Lance Fortnow, and Carsten Lund. Non-deterministic exponential time has
two-prover interactive protocols. Computational complexity, 1(1):3-40, 1991. 1.1, 1.1, 1.2,
16, 7.1.1

[BSGH+06] Eli Ben-Sasson, Oded Goldreich, Prahladh Harsha, Madhu Sudan, and Salil Vadhan. Robust
PCPs of proximity, shorter PCPs, and applications to coding. SIAM Journal on Computing,
36(4):889-974, 2006. 2

[BSS05] Eli Ben-Sasson and Madhu Sudan. Simple PCPs with poly-log rate and query complexity.
In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages
266-275. ACM, 2005. 2

[BSS08] Eli Ben-Sasson and Madhu Sudan. Short PCPs with polylog query complexity. SIAM Journal
on Computing, 38(2):551-607, 2008. 10.4.1

[BVY17] Mohammad Bavarian, Thomas Vidick, and Henry Yuen. Hardness amplification for entangled
games via anchoring. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory
of Computing, pages 303-316. ACM, 2017. 2, 11, 11.1

[CHSH69] John F Clauser, Michael A Horne, Abner Shimony, and Richard A Holt. Proposed experiment
to test local hidden-variable theories. Physical review letters, 23(15):880, 1969. 1.2

[CHTW04] Richard Cleve, Peter Hoyer, Benjamin Toner, and John Watrous. Consequences and limits of
nonlocal strategies. In Proceedings. 19th IEEE Annual Conference on Computational
Complexity, 2004., pages 236-249. IEEE, 2004. 1.2

[CLP15] Valerio Capraro, Martino Lupini, and Vladimir Pestov. Introduction to sofic and hyperlinear
groups and Connes’ embedding conjecture, volume 2136. Springer, 2015. 1.4

[CM14] Richard Cleve and Rajat Mittal. Characterization of binary constraint system games. In
International Colloquium on Automata, Languages, and Programming, pages 320-331. Springer,
2014. 7.2

[Col19] Andrea Coladangelo. A two-player dimension witness based on embezzlement, and an
elementary proof of the non-closure of the set of quantum correlations. arXiv preprint
arXiv:1904.02350, 2019. 1.3

[Con76] Alain Connes. Classification of injective factors. Cases II_1, II_∞, III_λ, λ ≠ 1. Annals of
Mathematics, pages 73-115, 1976. 1, 1.3, 1.3

[CS17] Andrea Coladangelo and Jalex Stark. Robust self-testing for linear constraint system games.
arXiv preprint arXiv:1709.09267, 2017. 7.2



[CS18] Andrea Coladangelo and Jalex Stark. Unconditional separation of finite and
infinite-dimensional quantum correlations. arXiv preprint arXiv:1804.05116, 2018. 1.2, 1.3

[DLTW08] Andrew C Doherty, Yeong-Cherng Liang, Ben Toner, and Stephanie Wehner. The quantum
moment problem and bounds on entangled multi-prover games. In 2008 23rd Annual IEEE
Conference on Computational Complexity, pages 199-210. IEEE, 2008. 1.3, 1.4, 12.3

[DPP19] Ken Dykema, Vern I Paulsen, and Jitendra Prakash. Non-closure of the set of quantum
correlations via graphs. Communications in Mathematical Physics, 365(3):1125-1142, 2019. 1.3

[DSV15] Irit Dinur, David Steurer, and Thomas Vidick. A parallel repetition theorem for entangled
projection games. Computational Complexity, 24(2):201-254, 2015. 11, 11.2

[Eke91] Artur K Ekert. Quantum cryptography based on Bell’s theorem. Physical review letters,
67(6):661, 1991. 1.2

[FJVY19] Joseph Fitzsimons, Zhengfeng Ji, Thomas Vidick, and Henry Yuen. Quantum proof systems
for iterated exponential time, and beyond. In Proceedings of the 51st Annual ACM SIGACT
Symposium on Theory of Computing, pages 473-480. ACM, 2019. 1.2, 1.2, 1.3, 1.3

[FL92] Uriel Feige and László Lovász. Two-prover one-round proof systems: Their power and their
problems. In Proceedings of the twenty-fourth annual ACM symposium on Theory of computing,
pages 733-744. ACM, 1992. 1.1

[FNT14] Tobias Fritz, Tim Netzer, and Andreas Thom. Can you compute the operator norm?
Proceedings of the American Mathematical Society, 142(12):4265-4276, 2014. 1.3, 10, 1.4

[Fri12] Tobias Fritz. Tsirelson’s problem and Kirchberg’s conjecture. Reviews in Mathematical
Physics, 24(05):1250012, 2012. 1, 1.3, 1.4

[GH16] Isaac Goldbring and Bradd Hart. Computability and the Connes embedding problem. Bulletin
of Symbolic Logic, 22(2):238-248, 2016. 1.3

[GH20] Isaac Goldbring and Bradd Hart. The universal theory of the hyperfinite $\mathrm{II}_1$ factor is not
computable. arXiv preprint arXiv:2006.05629, 2020. 1.3

[Har04] Prahladh Harsha. Robust PCPs of proximity and shorter PCPs. PhD thesis, Massachusetts
Institute of Technology, 2004. 10.4

[HS65] Juris Hartmanis and Richard E Stearns. On the computational complexity of algorithms.
Transactions of the American Mathematical Society, 117:285-306, 1965. 3.1, 3.1

[IKM09] Tsuyoshi Ito, Hirotada Kobayashi, and Keiji Matsumoto. Oracularization and two-prover
one-round interactive proofs against nonlocal strategies. In 2009 24th Annual IEEE Conference on
Computational Complexity, pages 217-228. IEEE, 2009. 1.2

[IV12] Tsuyoshi Ito and Thomas Vidick. A multi-prover interactive proof for NEXP sound against
entangled provers. In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science,
pages 243-252. IEEE, 2012. 1.2




[Ji16] Zhengfeng Ji. Classical verification of quantum proofs. In Proceedings of the forty-eighth
annual ACM symposium on Theory of Computing, pages 885-898. ACM, 2016. 1.2

[Ji17] Zhengfeng Ji. Compression of quantum multi-prover interactive proofs. In Proceedings of the
49th Annual ACM SIGACT Symposium on Theory of Computing, pages 289-302. ACM, 2017.
i sae ee

[JNP+11] Marius Junge, Miguel Navascues, Carlos Palazuelos, David Perez-Garcia, Volkher B Scholz,
and Reinhard F Werner. Connes’ embedding problem and Tsirelson’s problem. Journal of
Mathematical Physics, 52(1):012102, 2011. 1, 1.3, 1.4

[JNV+20] Zhengfeng Ji, Anand Natarajan, Thomas Vidick, John Wright, and Henry Yuen. Quantum
soundness of the classical low individual degree test. arXiv preprint arXiv:2009.12982, 2020.
1.2, 3.6, 5.23, 7, 7.1.1, 8.4.3, 8.4.3, A.1, A.2

[JNV+21] Zhengfeng Ji, Anand Natarajan, Thomas Vidick, John Wright, and Henry Yuen. Quantum
soundness of testing tensor codes. arXiv preprint arXiv:2111.08131, 2021. 7.1.1, 7.1.1, 1, 2,
3.7-1

[JPY14] Rahul Jain, Attila Pereszlényi, and Penghui Yao. A parallel repetition theorem for entangled
two-player one-round games under product distributions. In 2014 IEEE 29th Conference on
Computational Complexity (CCC), pages 209-216. IEEE, 2014. 11

[KKM+11] Julia Kempe, Hirotada Kobayashi, Keiji Matsumoto, Ben Toner, and Thomas Vidick.
Entangled games are hard to approximate. SIAM Journal on Computing, 40(3):848-877, 2011.
1.2

[Kle54] Stephen C. Kleene. Introduction to Metamathematics. Journal of Symbolic Logic,
19(3):215-216, 1954. 15, 35

[KPS18] Se-Jin Kim, Vern Paulsen, and Christopher Schafhauser. A synchronous game for binary
constraint systems. Journal of Mathematical Physics, 59(3):032201, 2018. 1.2, 1.4, 5.13

[KV11] Julia Kempe and Thomas Vidick. Parallel repetition of entangled games. In Proceedings of the
43rd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2011, pages 353-362,
2011. 3.6, A.1

[LFKN90] Carsten Lund, Lance Fortnow, Howard Karloff, and Noam Nisan. Algebraic methods for
interactive proof systems. In Proceedings of 31st Annual Symposium on Foundations of Computer
Science, pages 2-10. IEEE, 1990. 1.1

[LJ91] H.W. Lenstra Jr. Finding isomorphisms between finite fields. Mathematics of Computation,
56(193):329-347, 1991. 3.3.2

[Mer90] David Mermin. Simple unified form for the major no-hidden-variables theorems. Physical
Review Letters, 65(27):3373, 1990. 7.2

[MNY20] Hamoon Mousavi, Seyed Sajjad Nezhadi, and Henry Yuen. On the complexity of zero gap
MIP*. arXiv preprint arXiv:2002.10490, 2020. 1.4




[MP13] Gary L Mullen and Daniel Panario. Handbook of finite fields. Chapman and Hall/CRC, 2013.
3:3; dol

[MR18] Magdalena Musat and Mikael Rørdam. Non-closure of quantum correlation matrices and
factorizable channels that require infinite dimensional ancilla. arXiv preprint arXiv:1806.10242,
2018. 1.3

[NPA08] Miguel Navascués, Stefano Pironio, and Antonio Acin. A convergent hierarchy of
semidefinite programs characterizing the set of quantum correlations. New Journal of Physics,
10(7):073013, 2008. 1, 1, 1.3, 1.4, 12.3

[NV17] Anand Natarajan and Thomas Vidick. A quantum linearity test for robustly verifying
entanglement. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing,
STOC 2017, pages 1003-1015, New York, NY, USA, 2017. ACM. A.4

[NV18a] Anand Natarajan and Thomas Vidick. Low-degree testing for quantum states, and a quantum
entangled games PCP for QMA. In 2018 IEEE 59th Annual Symposium on Foundations of
Computer Science (FOCS), pages 731-742. IEEE, 2018. 1.2, 1.2, 2,7, 7.1.1, 7.3

[NV18b] Anand Natarajan and Thomas Vidick. Two-player entangled games are NP-hard. In
Proceedings of the 33rd Computational Complexity Conference, page 20. Schloss
Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018. 1.2, 2, 16

[NW19] Anand Natarajan and John Wright. NEEXP ⊆ MIP*. arXiv preprint arXiv:1904.05870v3,
2019. 1212, 2,22, 2325, 5.2, SAB S19 522,524,32, 3:2, 5.27328. TL1 T3 Tad,
8.1, 8.4.1, 8.4.1, 10, 10.1, 10.1, 1, 10.4

[Oza13] Narutaka Ozawa. About the Connes embedding conjecture. Japanese Journal of Mathematics,
8(1):147-183, 2013. 1.3, 1.4

[Pap94] Christos Papadimitriou. Computational complexity. Addison Wesley, 1994. 3.1, 10.1, 10.2

[Per90] Asher Peres. Incompatible results of quantum measurements. Physics Letters A,
151(3-4):107-108, 1990. 7.2

[PSS+16] Vern I Paulsen, Simone Severini, Daniel Stahlke, Ivan G Todorov, and Andreas Winter.
Estimating quantum chromatic numbers. Journal of Functional Analysis, 270(6):2188-2222,
2016. 1.2, 5.13

[PV16] Carlos Palazuelos and Thomas Vidick. Survey on nonlocal games and operator space theory.
Journal of Mathematical Physics, 57(1):015220, 2016. 7

[Rog87] Hartley Rogers, Jr. Theory of Recursive Functions and Effective Computability. MIT Press,
Cambridge, MA, USA, 1987. 15

[RS96] Ronitt Rubinfeld and Madhu Sudan. Robust characterizations of polynomials with
applications to program testing. SIAM Journal on Computing, 25(2):252-271, 1996. 16

[RS97] Ran Raz and Shmuel Safra. A sub-constant error-probability low-degree test, and a
sub-constant error-probability PCP characterization of NP. In Proceedings of the twenty-ninth
annual ACM symposium on Theory of computing, pages 475-484, 1997. 16




[Sch80] Jacob Schwartz. Fast probabilistic algorithms for verification of polynomial identities. Journal 
of the ACM, 27(4):701-717, 1980. 3.20 


[Sha90] Adi Shamir. IP = PSPACE. In Proceedings of 31st Annual Symposium on Foundations of 
Computer Science, pages 11-15. IEEE, 1990. 1.1 


[Sho90] Victor Shoup. New algorithms for finding irreducible polynomials over finite fields.
Mathematics of Computation, 54(189):435-447, 1990. 3.3.2


[Sip12] Michael Sipser. Introduction to the Theory of Computation. Cengage Learning, 2012. 3.1 


[Slo19a] William Slofstra. The set of quantum correlations is not closed. In Forum of Mathematics, Pi, 
volume 7. Cambridge University Press, 2019. 1, 1.3, 1.4 


[Slo19b] William Slofstra. Tsirelson’s problem and an embedding theorem for groups arising from 
non-local games. Journal of the American Mathematical Society, 2019. 1.3 


[Tsi93] Boris S Tsirelson. Some results and problems on quantum Bell-type inequalities. Hadronic
Journal Supplement, 8(4):329-345, 1993. 1.3


[Tsi06] Boris S Tsirelson. Bell inequalities and operator algebras, 2006. Problem statement
from website of open problems at TU Braunschweig (2006), available at
http://web.archive.org/web/20090414083019/http://www.imaph.tu-bs.de/qi/pr
1, 1.3


[Tur37] Alan M. Turing. On computable numbers, with an application to the Entscheidungsproblem. 
Proceedings of the London mathematical society, 2(1):230-265, 1937. 3.1 


[Vid16] Thomas Vidick. Three-player entangled XOR games are NP-hard to approximate. SIAM Journal
on Computing, 45(3):1007-1063, 2016. 1.2, 7.1.1


[VW16] Thomas Vidick and John Watrous. Quantum proofs. Foundations and Trends in Theoretical 
Computer Science, 11(1-2):1-215, 2016. 5.3, 12.2 


[Wan89] Charles C. Wang. An algorithm to design finite field multipliers using a self-dual normal basis. 
IEEE Transactions on Computers, 38(10):1457-1460, 1989. 3.3.2 


[Yue16] Henry Yuen. A parallel repetition theorem for all entangled games. In 43rd International 
Colloquium on Automata, Languages, and Programming (ICALP 2016). Schloss Dagstuhl- 
Leibniz-Zentrum fuer Informatik, 2016. 11.2 


[Zip79] Richard Zippel. Probabilistic algorithms for sparse polynomials. In Symbolic and Algebraic 
Computation, pages 216-226, 1979. 3.20 




